User Tools

Site Tools


cb_mirror_public:automated_duplicates_processing_excel_file_guide_docx_files_22299

Title: Automated duplicates processing Excel file GUIDE

Original CoS Document (slug): automated-duplicates-processing-excel-file-guide

Login Required to view? No

Attached File: How_to_setup_the_CleanUp_Excel_Automation_File.docx

Set up guide for the Excel file that screen-scrapes duplicate profiles and automatically merges or “Mark as Different Person” each duplicate pair

Created: 2023-10-31 17:21:02

Updated: 2023-10-31 17:21:10

Published: 2023-10-31 03:00:00

Converted: 2025-04-14T20:14:14.286784070


The CleanUp Excel automation file uses Visual Basic for Applications (VBA) to search, check, and replace bad profile information in Citizen Builder. The VBA originally used only Microsoft libraries, thus depended on a native Internet Explorer library to interface with the COS website.

Most web sites have given up supporting Internet Explorer, since Microsoft no longer supports it. There are some parts of the COS website that cannot be accessed with Internet Explorer, so a different form of access is required.

A robust screen-scraping tool called UI.Vision RPA, closely related to a tool called Selenium, have been available for 5-10 years. The two are nearly identical, so will just be called “Selenium” here. Neither was designed for use with VBA, although they work with a half-dozen languages. A user called florentbr created a library that allows Basic (VB, VBA, VBScript) access to the web using a variety of browsers.

These instructions describe how to make this VBA library available to our CleanUp tool. This Youtube video, or https://www.youtube.com/watch?v=2jZGhKugK70&ab_channel=GoveAllen also describes the below.

  1. Get the Selenium Basic code

You will see something like this:

<HTML><ol start=“2” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>Click the file name under Releases to download the file.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Once downloaded, double-click the exe file to install it.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

<HTML><ol start=“2” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>
Get the Chrome Driver
<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

Selenium Basic is designed to work with a “WebDriver” standard created by the Selenium team. The Selenium team keeps a set of browser-specific drivers updated so that applications using WebDriver (such as Selenium Basic) will work with any new browser version as it is released.

The CleanUp file assumes we use Google Chrome, so we only have to install a single driver.

  1. We first need to find the Chrome browser version. Note that this can be updated in the background by Google at any time!
  2. Open Chrome and look in the upper right corner:
  3. Click on the three dots, then Help, then About Google Chrome.
  4. Note at least the first few digits of the Version. Latest: Version  113.0.5672.126 (Official Releases) (32-bit)

Now we download the proper chromedriver version.

The complete list of all browser drivers is listed here:

https://www.selenium.dev/documentation/en/getting_started_with_webdriver/browsers/.

<HTML><ol start=“5” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>As a shortcut, go directly to the list of chromedrivers here: https://sites.google.com/chromium.org/driver/<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

<HTML><ol start=“6” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>Click on the link to the matching driver, typically the “latest stable release” (e.g. “ChromeDriver 113.0.5672.126”) (Note that this is an old version. Yours will be different). It will open up a list of drivers for various Operating Systems.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Download the chromedriver file for Windows. Its file name will be “chromedriver_win32.zip”.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>It is a good practice to append the version number to this zip file, in our example above this would be “chromedriver_win32 113.0.5672.126.zip. It will help you keep track in the future.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Be aware that more than the first two digits determine whether your chromedriver is the right one. The chrome connection can break if, in the example above, anything after the 113.0. changes. But not always. The macro will provide a message if the chromedriver is out of sync.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

Next: put the Chrome driver in its place


Put the Chrome Driver in the right folder

  1. Open File Explorer, then the View tab;
  2. Make sure that “Hidden Items” is checked.
  3. In the left column of File Explorer, open your C: drive, then the Users folder. Note the folder name that looks something like your name under the Users folder. This is your USERNAME.
  4. Then go here in File Explorer:

C:\Users\USERNAME\AppData\Local\SeleniumBasic, substituting YOUR user name.

<HTML><ol start=“5” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>Copy this path. This is where you will need to put the chrome driver (next steps)<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Now unzip the downloaded zip file, “chromedriver_win32.zip”<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Click “Extract to” then paste the path you just copied into the Destination path textbox. Accept the defaults in the rest of the dialog and click OK.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

The result in File Explorer will look like this:

<HTML><ol start=“8” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>If the file is not allowed to be copied to this location because administrator permission is needed,<HTML></p></HTML> <HTML><ol style=“list-style-type: lower-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Extract chromedriver.exe to your Downloads folder<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Open the Start Menu<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Click in some blank space, then enter “Command” to get the Command prompt item in the top left area<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Click on “Run as Administrator” in the right pane.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Type something like this; you will need to replace “bill” with your username. There is a space between the source and destination paths. “chromedriver.exe” is all lower case.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML> <HTML></li></HTML><HTML></ol></HTML>

C:\WINDOWS\system32>copy c:\Users\bill\Downloads\chromedriver.exe c:\Users\bill\AppData\SeleniumBasic

<HTML><ol start=“9” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>If the file cannot be pasted because the file is in use:<HTML></p></HTML> <HTML><ol style=“list-style-type: lower-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Close Chrome<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Close Cleanup Prod file<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML> <HTML></li></HTML><HTML></ol></HTML>

<HTML><ol start=“3” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Update Path Environment variable to include the path to the Chrome.exe file<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

The Path to Chrome in Windows 10 is C:\Program Files (x86)\Google\Chrome\Application

The Path to Chrome in Windows 11 is C:\Program Files\Google\Chrome\Application

Having the Chrome path in the Environment variables ensures that Chrome will open no matter where it is opened from. You don’t normally have to worry about this when opening a browser manually, but we are doing this with the CleanUp program.

  1. Verify that this is the path. Open Start, type “Chrome”. Right click on Chrome and choose “Open File Location”. This opens the file location of the shortcut to Chrome. Right click on the shortcut and again choose “Open File Location”. The result should be the path above.
  2. Here is the way to add Chrome to the path safely.
    1. Go to Start, Control Panel, System.
    2. Scroll the right pane to the bottom, click on “Advanced system settings”, which opens a popup.
    3. Click the “Environment Variables…” button, which opens another popup.
    4. In the bottom “System variables” section, double-click on “Path”, which opens another popup.
    5. If there is a Chrome path that doesn’t match the Path above, delete it.
    6. Once there is no Chrome path, click New and enter the path above, then OK
    7. Note: Typing “Path” in the Command prompt will not likely show that the path has been added, since the text returned to the Path command is truncated to 1024 characters.


<HTML><ol start=“4” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Open Excel to make sure Selenium VBA is active<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

  1. In the Cleanup file, first make sure you can access the Visual Basic environment
    1. Click File, Options, Customize Ribbon
    2. In the right pane, make sure “Developer” Main Tab is checked, then click OK
  2. Click on Developer Tab at the top of the Excel screen, then click on Visual Basic
    1. In the Visual Basic environment, click on Tools, then References…
    2. Make sure “Selenium Type Library” is checked. If it isn’t scroll down to find and check it.
  3. Make sure .Net Framework 3.5 is installed
    1. To see if it is enabled, click on the Windows Key , click in some blank space and type “Windows Features”. Select “Windows Features On or Off, control panel. It should be checked or have a black box next to it.

<HTML><ol start=“5” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Always open Chrome in Debug mode<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

There are two ways to do this.

  1. Go to the Chrome link above that you reached the FIRST time you opened the file location. Right click the file name, then choose properties.
    1. In the Shortcut tab (which should be the default), in the Target text box, add after the last double quote, a space, then this text:
–remote-debugging-port=9222

<HTML><ol start=“2” style=“list-style-type: lower-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Note the TWO dashes at the beginning.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>You will be asked to save it as an administrator. If you can, you’re done.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>It would be unusual if Chrome is ever opened when not in remote debugging mode.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

<HTML><ol start=“2” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML><HTML><p></HTML>If you could not save the new target for the shortcut, go to the Chrome link above that you reached the SECOND time you opened the file location.<HTML></p></HTML> <HTML><ol style=“list-style-type: lower-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Right click the filename, and “Send to” the “desktop (create shortcut)”.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>On the desktop, right click the Chrome icon and click Properties.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>Follow the steps a and b above.<HTML></p></HTML><HTML></li></HTML> <HTML><li></HTML><HTML><p></HTML>You will always want to open Chrome with this link.<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML> <HTML></li></HTML><HTML></ol></HTML>

<HTML><ol start=“6” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Malwarebytes settings<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

If Malwarebytes is installed on your machine, you will need to modify its settings.
  1. Make sure Excel is open
  2. Click on gear icon, which is settings
  3. Click on Network
  4. Click on the arrow to the right of “Control how individual items interact with the VPN”
  5. Click “Bypass VPN”
  6. Click ‘Add item” button, then “Application”, then scroll down to “Excel.exe” and select it.
    1. “C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE”

<HTML><ol start=“7” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>Alternative Malwarebytes settings<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

  1. Click on gear icon, which is settings
  2. Click on Allow List
  3. Click Add, then Application, then Browse
  4. Enter path to Excel execution file.
    1. “C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE”


<HTML><ol start=“8” style=“list-style-type: upper-alpha;”></HTML> <HTML><li></HTML><HTML><p></HTML>What do I do with this error?<HTML></p></HTML><HTML></li></HTML><HTML></ol></HTML>

Every once in while you will see an error when running the Duplicates program with this text:

<HTML><ol style=“list-style-type: decimal;”></HTML> <HTML><li></HTML» <HTML><p></HTML>“Chromedriver version is likely out of sync with your version of Chrome”<HTML></p></HTML> <HTML></li></HTML><HTML></ol></HTML>

This error might occur once every month or quarter, whenever Google updates chromedriver. The error occurs because your Chrome browser is automatically updated in the background, thus making the version of the Chrome browser different than the version of chromedriver.

The solution is to repeat the steps in sections B. and C. above.

<HTML><ol start=“2” style=“list-style-type: decimal;”></HTML> <HTML><li></HTML» <HTML><p></HTML>Checkboxes, radio buttons, and command buttons do not respond to Clicks in code.<HTML></p></HTML> <HTML></li></HTML><HTML></ol></HTML>

Sometimes, a statement like this will not work:

drv.FindElementByXPath(“/html/body/main/div/form/table/tbody/tr[2]/td[2]/div/div/label”).click

The problem is that some object in the page is lying on top of the element that is being clicked. This is peculiar to SeleniumBasic.

The solution is to click these elements with javaScript inside a Selenium Basic ExecuteScript statement:

Set Ele = drv.FindElementByXPath(“/html/body/main/div/form/table/tbody/tr[2]/td[2]/div/div/label”)

Rtn = drv.ExecuteScript(“arguments[0].click();”, Ele)
cb_mirror_public/automated_duplicates_processing_excel_file_guide_docx_files_22299.txt · Last modified: 2025/04/14 20:14 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki