Web import settings

Aug 14, 2017

The Settings dialog box contains general AppDNA options. To open this dialog box, choose Edit > Settings from the menus.

Use the Web Import page of the Settings dialog box to customize the import of web applications through the Import web applications screen.

After making changes in this dialog box, click Save to preserve them.

Note: With one or two exceptions, the Web Import settings correspond to the settings available on the General Settings and Spider Settings tabs in the stand-alone spider.

Direct Import tab

These options relate to the Web Direct Import tab in the Import Web Applications screen.

Simultaneous Imports (1-20). This controls the number of imports that run at the same time. The optimum value depends on your hardware configuration. If you increase the value of this setting and then find that imports fail with a “deadlock” error, decrease it. The default and generally recommended setting is 3.

Preserve Log Files. Select this check box to save log files. This can sometimes be useful for diagnostic purposes. Clear this check box if you do not want to save log files, for example, to save disk space.

Web Spider tab

These options relate to the Web Capture Import tab in the Import Web Applications screen.

Browser timeout. The length of time in seconds that you want the spider to wait for a page to load before ignoring it and moving on to the next page (when running the spider in automatic mode). When you run the spider in manual mode, this setting is used for the first page only. The default is 15 seconds.

Site traversal depth. Specify how many levels of links you want the spider to follow. For example, if you specify a depth of 1, the spider starts on the site’s index page and visits each of the links on that page. If one of those linked pages contains further links, the spider visits them only if the depth is set to 2 or more. The default is 25.
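The effect of the depth setting can be illustrated with a small depth-limited, breadth-first walk over an in-memory link graph. This is only a sketch of the general idea, not the spider's actual implementation; the `crawl` function, the `site` dictionary, and the page names are hypothetical.

```python
from collections import deque

def crawl(links, start, max_depth):
    """Return {page: depth} for every page reachable within max_depth links of start."""
    visited = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        depth = visited[page]
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(page, []):
            if target not in visited:
                visited[target] = depth + 1
                queue.append(target)
    return visited

# Hypothetical site: the index page links to /a and /b, and /a links deeper.
site = {
    "/index": ["/a", "/b"],
    "/a": ["/a/1"],
    "/a/1": ["/a/1/x"],
}

shallow = crawl(site, "/index", 1)  # index page plus the pages it links to
deeper = crawl(site, "/index", 2)   # also includes /a/1, but not /a/1/x
```

With a depth of 1 the walk stops at the pages linked from the index page; raising the depth to 2 adds the pages that those pages link to.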

Automatically close dialog boxes and popups. Select this check box if you want the spider to automatically close dialog boxes that it encounters when running in automatic mode. This is useful, for example, if you want to leave the import running unattended. Clear this check box if you want the spider to wait for you to close dialog boxes manually.

Restrict web app to its virtual directory. Select this check box if you want the spider to ignore any links outside of the web application’s virtual directory (for example, http://myserver/myWebApp). This is useful when there are multiple web applications on the same server and each one is accessed by a different part of the URL. Clear this check box if you want the spider to follow links outside of the virtual directory.
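As a rough illustration of what restricting the spider to a virtual directory means, the following sketch tests whether a link falls under the application's root URL. It assumes the check is a host match plus a path-prefix match, which may differ from how the spider decides; the function name `in_virtual_directory` is hypothetical.

```python
from urllib.parse import urlsplit

def in_virtual_directory(link, app_root):
    """Return True if link is on the same host as app_root and under its path."""
    root = urlsplit(app_root)
    target = urlsplit(link)
    if target.hostname != root.hostname:
        return False
    # Compare with trailing slashes so /myWebAppOld does not match /myWebApp.
    root_path = root.path.rstrip("/") + "/"
    target_path = target.path.rstrip("/") + "/"
    return target_path.startswith(root_path)

inside = in_virtual_directory("http://myserver/myWebApp/page.aspx", "http://myserver/myWebApp")
outside = in_virtual_directory("http://myserver/otherApp/page.aspx", "http://myserver/myWebApp")
```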

Include sub-domains. Select this check box if you want the spider to follow links to sub-domains of the main domain (for example, http://staging.dev.myserver/myWebApp). Make sure you select this check box if the web application redirects to a sub-domain of the main domain. Clear this check box if you want the spider to ignore links to sub-domains.
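A sub-domain check of this kind can be sketched as a suffix comparison on the host name. The example below is an assumption about the general technique rather than a description of the spider's logic; `is_same_or_subdomain` is a hypothetical helper.

```python
from urllib.parse import urlsplit

def is_same_or_subdomain(link, main_domain):
    """Return True if the link's host is main_domain or one of its sub-domains."""
    host = urlsplit(link).hostname or ""
    main = main_domain.lower()
    # Require a dot before the suffix so "notmyserver" does not match "myserver".
    return host == main or host.endswith("." + main)

followed = is_same_or_subdomain("http://staging.dev.myserver/myWebApp", "myserver")
ignored = is_same_or_subdomain("http://notmyserver/myWebApp", "myserver")
```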

Form User Interaction. Select this check box if you want the spider (when running in automatic mode) to stop on every page that has a form and prompt you to fill it in. This is particularly useful when the web application has pages that require the user to log in. When this option is selected and the spider detects a form on a webpage, it opens a dialog box and highlights the form input boxes in yellow. For more information, see Web Capture Processing.

Allow Proxy Authentication Prompt. Select this check box if your LAN is configured to use a proxy server and you have selected the Automatically close dialog boxes and popups check box. When this option is selected, the spider leaves the proxy authentication dialog box open and waits for you to enter your login information. Clear this check box if your LAN is not configured to use a proxy server.

Allow capture duplicates. This setting affects the spider when running in manual mode only. Select this check box if you want the spider to capture the same page more than once if the page changes. This is useful when capturing web applications that use JavaScript and related technologies (such as AJAX) to modify pages after they are loaded. After you select this check box, configure the following options:

  • Duplicates count for URL. Enter the maximum number of times you want the spider to capture a page.
  • Duplicates diff ratio. Enter the percentage by which the page must change in order for it to be captured again.
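The way the two duplicate options interact can be sketched as follows. How the spider measures the difference between two versions of a page is not documented here, so this example assumes a simple text-similarity measure from Python's standard `difflib` module; the function and parameter names are hypothetical.

```python
from difflib import SequenceMatcher

def should_capture_again(previous_html, current_html,
                         captures_so_far, max_captures, min_change_percent):
    """Decide whether a changed page should be captured again."""
    if captures_so_far >= max_captures:
        return False  # the per-URL capture limit has been reached
    # ratio() is 1.0 for identical text and 0.0 for text with nothing in common.
    similarity = SequenceMatcher(None, previous_html, current_html).ratio()
    changed_percent = (1.0 - similarity) * 100
    return changed_percent >= min_change_percent

unchanged = should_capture_again("aaaa", "aaaa", 1, 3, 10)
rewritten = should_capture_again("aaaa", "bbbb", 1, 3, 10)
at_limit = should_capture_again("aaaa", "bbbb", 3, 3, 10)
```

In this sketch a page is captured again only when both conditions hold: the capture count for the URL is still below the maximum, and the page has changed by at least the configured percentage.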

Capture results output directory. Specifies the location of the captured results. This is where you can find the generated MSI files and the captured webpages. You normally only need to use these files when you use the stand-alone web application capture tools.

Allowed external domains. A list of external domains that you want the spider to follow links to.

Domain. Specify the web application domain here and click Add to add it to the list of allowed external domains. If the web application redirects to a different domain, enter that domain here. Similarly, if the web application uses an external authentication server that is in a different domain, enter that domain here.
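The role of the allowed external domains list can be pictured as a simple membership test applied to each link's host. This is a minimal sketch under the assumption of exact host matching, which may not reflect the spider's actual matching rules; `may_follow` and the example domains are hypothetical.

```python
from urllib.parse import urlsplit

def may_follow(link, app_domain, allowed_external_domains):
    """Return True if the link is on the application's domain or an allowed external one."""
    host = urlsplit(link).hostname or ""
    allowed = {domain.lower() for domain in allowed_external_domains}
    return host == app_domain.lower() or host in allowed

# Hypothetical list containing the domain of an external authentication server.
allowed_domains = ["login.example.net"]

auth_link = may_follow("https://login.example.net/auth", "myserver", allowed_domains)
other_link = may_follow("http://other.example.org/", "myserver", allowed_domains)
```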