How to: Configure Data Scans

Summary

NSS performs regular scans of paths within target file systems. The data gathered during these file system crawls is used to populate the Console interface. Raw data is collected via SMB and consolidated data (processed to reduce footprint and allow fast client display) is stored to the NSS SQL database.

The 'Step-by-step' section of this article is divided into the following categories: Path Configuration, Edit/Delete paths, Scheduling, Path Exclusion, Scan History and Confirm results.

Note on network overhead of scanning operations: With default settings, the network overhead of scanning operations is minimal. The number of scanning threads can be increased or decreased to tune the balance between overhead and scan speed: more threads increase scanning speed at the cost of network overhead, and fewer threads reduce network overhead at the cost of scanning speed. These options are accessed by clicking the 'Performance' option shown in step six of the step-by-step guide. Consult with a Northern Field Engineer if you are in any doubt about the settings that apply in your environment or use case.
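The thread trade-off described above can be illustrated with a minimal sketch. This is not how NSS implements scanning; it only shows the general idea using Python's standard ThreadPoolExecutor. The function names count_files and scan_roots are hypothetical, and the threads parameter stands in for the scanning-thread setting.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_files(root):
    """Walk one directory tree and return the number of files found."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def scan_roots(roots, threads=4):
    """Scan each root on a pool of `threads` workers (hypothetical helper).

    More workers tend to finish sooner but issue more concurrent requests
    to the file server; fewer workers do the opposite.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        counts = list(pool.map(count_files, roots))
    return dict(zip(roots, counts))
```

The result is the same for any thread count; only the elapsed time and the number of simultaneous requests change.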

Pre-requisites

The NSS Service account should have the relevant privileges to access domain resources, resources of the NSS Managing Host, and target file systems or paths. Backup Administrator rights are recommended. NAS or Unified Storage environments may require accounts with special privileges.

See KB-1745 for reference or consult with a Northern Field Engineer if more information is required.
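Before adding paths, it can help to confirm that the account can at least list each target directory. The sketch below is an assumption-laden illustration, not an NSS tool: check_read_access is a hypothetical helper, and the result is only meaningful when the script runs under the same account as the NSS Service. It tests directory listing only, not the full set of privileges described in KB-1745.

```python
import os


def check_read_access(paths):
    """Map each path to None when it can be listed, else to the error text."""
    results = {}
    for path in paths:
        try:
            with os.scandir(path) as entries:
                next(entries, None)  # force at least one directory read
            results[path] = None
        except OSError as exc:
            # Access-denied failures surface as PermissionError,
            # which is a subclass of OSError.
            results[path] = f"{type(exc).__name__}: {exc.strerror}"
    return results
```

Any path that maps to a PermissionError message is a candidate for the Access Denied errors discussed at the end of this article.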

Step-by-step

Path Configuration

  1. Access the NSS Console Administration view and hover the mouse pointer over the Data Scan tab in the top menu. Select 'Configuration' in the drop-down menu that becomes visible. This will take you to the Data Scan Configuration page.

    Expand the 'Path Configuration' tab and click on the 'Add Path'-button on the right-hand side of the page to start configuring the paths to scan.

    Data Scan: Add Path

  2. Specify your path in the Path-field and then set the appropriate number of Levels Down to scan.

    The Levels Down value determines where the scan should start. In the example screenshot below, the value is set to 1 Level Down. This means that the scan will start one level below 'Application', meaning that all subdirectories of the parent folder 'Application' will be individually scanned (i.e. \\SNV-RODA\files\Application\DirA, \\SNV-RODA\files\Application\DirB etc).

    The optimal number of Levels Down is entirely dependent on how the directory structure is configured in the target file system.

    Select an appropriate Path Category (see KB-3118 for more information) for your path in the drop-down list in order to assign a category to the path and its subdirectories. Click on the 'Add Path'-button to add the path to the path list.

    Data Scan: Add Path 2

  3. Continue to add more paths if needed. In the example below, additional paths have been added and assigned to different Path Categories. When you're finished, click on the 'Save'-button to save your changes.

    Data Scan: Add Multiple Paths

  4. This is the expected result:

    Data Scan: Path Configuration Success
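The Levels Down setting from step 2 can be pictured as choosing the directories at a given depth below the configured path as individual scan start points. The following is a minimal sketch of that idea, not the NSS implementation. The function name scan_start_points is hypothetical, and treating a value of 0 as "scan the path itself" is an assumption.

```python
import os


def scan_start_points(path, levels_down):
    """Return the directories `levels_down` levels below `path`.

    Assumption: 0 yields the path itself, 1 yields its immediate
    subdirectories, and so on.
    """
    current = [path]
    for _ in range(levels_down):
        next_level = []
        for directory in current:
            with os.scandir(directory) as entries:
                next_level.extend(
                    e.path for e in entries if e.is_dir(follow_symlinks=False)
                )
        current = next_level
    return sorted(current)
```

With the example from step 2, a call such as scan_start_points(r"\\SNV-RODA\files\Application", 1) would list the DirA, DirB etc. subdirectories, each of which is then scanned individually.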

Edit/Delete paths

  1. If you wish to make changes to the Data Scan configuration, simply click on the 'Edit Paths'-button next to the 'Add Path'-button. This will generate a view similar to the one displayed in step two in the Path Configuration process, but without the ability to add new paths. This page will allow you to make changes to the paths, the number of levels down to scan, change path categories and delete paths. Click on 'Save' to apply the changes.

    The example below shows a scenario where the Path Category is changed for one of the paths:

    Data Scan: Edit Path
  2. This is the expected result. The Path Category has been changed from Group Shares to Finance Shares:

    Data Scan: Edit Path - Success

Scheduling

  • Expand the Scheduling tab to configure the Data Scan Schedule. This section allows you to schedule the Data Scan to run on a time and date that best suits your environment and needs. Select an appropriate scan interval and a starting date for the scan. Save the changes by clicking on the Save button.

    Many customers choose a scan interval of seven days to balance database size with data granularity. The default setting is to scan every day at 9:00 PM (21:00).

    Data Scan: Schedule
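To see how a starting date and a scan interval translate into run times, consider the small sketch below. It assumes the first scan runs at the starting date itself and that later scans follow at fixed multiples of the interval; next_scan_times is a hypothetical helper, not part of NSS.

```python
from datetime import datetime, timedelta


def next_scan_times(start, interval_days, count=3):
    """Return the first `count` scan times, `interval_days` apart."""
    step = timedelta(days=interval_days)
    return [start + i * step for i in range(count)]
```

For example, next_scan_times(datetime(2016, 12, 12, 21, 0), 7) returns 21:00 on 12, 19 and 26 December 2016. A seven-day interval produces one scan point per week, while the daily default produces seven.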

Path Exclusion

  1. Expand the Path Exclusion tab to configure exclusions of specific paths and/or strings. This section allows you to configure the Data Scan to ignore paths that are regarded as irrelevant or unnecessary to include in the User Data Management scope. Click on the 'Add Path'-button in the bottom right corner to start configuring the Path Exclusion(s).

    Path Exclusion
  2. A box will appear that allows you to use either Pattern Matching or Regular Expressions for your exclusions. Specify your customized exclusion and click on 'Apply' to save the changes. See the screenshot below for more information. In this example, a simple path exclusion of a subdirectory has been configured:

    Path Exclusion Field

  3. The Data Scan will now ignore the path (\\SNV-RODA\file\Application\DirScan), as well as the default 'snapshot' and 'windir' patterns:

    Path Exclusion - Success
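The difference between the two exclusion styles in step 2 can be sketched with Python's standard fnmatch and re modules. This is only an illustration: the exact wildcard and regular-expression syntax that NSS accepts may differ, is_excluded is a hypothetical helper, and the case-insensitive matching is an assumption.

```python
import fnmatch
import re


def is_excluded(path, patterns=(), regexes=()):
    """Return True if `path` matches any wildcard pattern or regex."""
    lowered = path.lower()
    # Wildcard style: '*' matches any run of characters, including '\'.
    if any(fnmatch.fnmatchcase(lowered, p.lower()) for p in patterns):
        return True
    # Regular-expression style: searched anywhere in the path.
    return any(re.search(rx, path, re.IGNORECASE) for rx in regexes)
```

A wildcard such as r"\\SNV-RODA\file\Application\DirScan*" excludes that directory and everything below it, while a regular expression can target a name wherever it occurs in a path. Note that backslashes must be doubled inside a regular expression.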

Scan History

  • The Scan History page provides an insightful overview of the latest scans performed and the statistics connected to them. This page shows key metrics such as the scan times, the average scan speeds, the number of directories scanned, the number of files scanned, the number of errors encountered etc.

    This information can prove to be invaluable in scenarios where an in-depth analysis is required to understand unexpected behaviors related to the Data Scans.

    Walkthrough of the History page:
  1. Hover your mouse pointer over the Data Scan tab in the NSS Console Administration view and select 'History' in the drop-down menu that becomes visible. This will take you to the Scan History page.

    The page displays two dynamic graphs: one displaying the overall times for the different scanning and storing phases, and the other displaying the number of files scanned, paths scanned, rows stored to the database, the number of errors encountered etc. It's possible to click on the different categories at the bottom of each graph to either include or exclude their respective lines from the graphs.

    Scan History - Graphs
  2. The data displayed in the dynamic graphs is also presented in a detailed table that can be customized according to personal preference. It's possible to select which columns to display in the table by clicking on one of the arrows next to the table headers and then 'Columns' in the menu that becomes visible.

    The second option in this menu, 'Filter', makes it possible to filter the table on certain dates to display data for a specific period of time. See the screenshots below for an illustration of this feature:

    Scan History - Stats

    Table columns
    Scan History: Columns

    Table filter
    Scan History: Filter

  3. A reference table for the different column properties can be found below:

    Function Description

    ID
    The ID of the scan/report.

    Host Name
    The name of the Host that carried out the scan in question.

    Job Name
    The name of the scan job (e.g. Host Scan).

    Start Time
    The starting timestamp of the scan job.

    Stop Time
    The stopping timestamp of the scan job.

    Scanned Paths
    The paths scanned.

    Files/second
    The average scan speed in terms of files per second.

    Stored Rows/second
    The average database row storing speed in terms of stored rows per second.

    Path Count
    The number of paths scanned.

    Total Time
    The total scan time (including all phases: preparation, scan, post-scan processing, rebuilding indexes etc.).

    Prepare Time
    The time it took for NSS to prepare the scan before starting it.

    Scan Time
    The time it took for the scan operation to gather the data from the filesystem.

    Post Processing Time
    The time it took for the post-processing activities such as data consolidation and calculation.

    Scanned Dirs
    The number of scanned directories.

    Stored Rows
    The number of rows stored to the database.

    Deleted Rows
    The number of deleted rows from the database.

    Peak Memory Usage
    The highest point of memory consumption on the NSS-server responsible for the scan.

    Peak Thread Count
    The highest number of threads used by the CPU on the NSS-server responsible for the scan.

    Aborted
    A boolean value that indicates if the scan was aborted or not. True = Aborted, False = Not aborted.

    Resumed
    A boolean value that indicates if the scan was resumed from a stopped state or not. True = Resumed, False = Not resumed.

    Merged
    A boolean value that indicates if the scan was merged with another scan. True = Merged, False = Not merged.

    Version
    The version number of the NSS software running on the server responsible for the scan/report.

    Table Name
    -------------

    Post Scan store time
    -------------

    Index re-build time
    The time it took for NSS to rebuild the database indexes.

    Error Count
    The number of errors that occurred during the scan. Click on the magnifying glass next to the number to generate a detailed overview of the errors encountered.

    DB Retry Count
    The number of database insertion retries.
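The relationship between some of the columns above can be sketched as follows. This is a hypothetical model of a history row, not an NSS data structure: the ScanRecord fields are a small illustrative subset, times are assumed to be in seconds, and dividing by Scan Time to obtain Files/second is an assumption about how the average is calculated.

```python
from dataclasses import dataclass


@dataclass
class ScanRecord:
    """Hypothetical subset of one Scan History row (times in seconds)."""
    scanned_files: int
    scan_time: float
    error_count: int = 0
    aborted: bool = False


def files_per_second(rec):
    """Average scan speed; 0.0 when no scan time was recorded."""
    return rec.scanned_files / rec.scan_time if rec.scan_time else 0.0


def needs_review(rec):
    """Flag scans that were aborted or that reported errors."""
    return rec.aborted or rec.error_count > 0
```

A check like needs_review mirrors what an administrator does by eye when looking at the Aborted and Error Count columns.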

Confirm results

  1. When you've finished configuring the Data Scan, either wait for the next scheduled run of the scan or click on the 'Run Now'-button to start it immediately.
    Data Scan: Run Now
  2. Verify that the paths are scanned and that rows are stored to the database:

    Data Scan: Status
  3. Wait for the scan to finish and then click on the 'Exit Administration' link to review your results in the graphical interface. The scan is finished when the Status says 'Idle'.

    The result interface is divided into four main sections: Custom, Growth, Age and Cost. Under the Growth, Age and Cost sections you will find the Path Categories that you previously configured when setting up the Data Scan. Each category displays data for the paths associated with it. The 'All repositories' tab shows a summarized overview for all Path Categories that have paths assigned to them.

    If you are unable to see any data, or specific data belonging to a certain Path Category, please go to the Access Rule page and verify that the logged-on user has an Access Rule defined that includes the paths in question.

    Important note: The growth calculations and graphs depend on historical data, so multiple scan points are required before they display accurate projections.

    Example of the Default Custom Dashboard:
    Data Scan Results

    Example of the Data Age section (All Repositories):

    Scan Results: Data Age

  4. Go to the Scan History section in the NSS Console or the Event Viewer in Windows to verify that no errors occurred during the scan. No 'Access Denied (Error Code 5)' errors should be seen here. Access Denied errors are usually related to insufficient permissions for the NSS Service Account. Please see the 'Suitable NSS Service Account'-section in KB-1745 for more information.
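The important note in step 3 about growth projections needing multiple scan points can be illustrated with a simple straight-line fit. How NSS actually calculates growth is not described in this article; the least-squares method below is a stand-in chosen for illustration, and project_size is a hypothetical helper.

```python
def project_size(points, target_day):
    """Linearly extrapolate the size at `target_day` from (day, size) points."""
    if len(points) < 2:
        raise ValueError("at least two scan points are needed for a projection")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        raise ValueError("scan points must be taken on different days")
    # Least-squares slope: growth in size units per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return mean_y + slope * (target_day - mean_x)
```

With a single scan point no slope can be computed, which is why the sketch raises an error in that case; each additional scan point makes the fitted trend less sensitive to any one measurement.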

For advanced troubleshooting, please contact the Technical Support team at Northern (support@northern.net).

ADDITIONAL RESOURCES

  • KB-3118 How to: Create Path Categories
  • KB Article: 3119

    Updated: 12/12/2016

    • Category
      • Usage
    • Affected versions
      • NSS 9.6
      • NSS 9.7
      • NSS 9.8
