/
Pulling Web Logs from Google Analytics

Pulling Web Logs from Google Analytics

Purpose: This procedure describes the process for extracting web logs from Google Analytics for USU Libraries' Encore instance. Please note that theses procedures include steps that are preparatory for the MARC Discoverability project, but that may be modified easily for other projects that start with web logs.


Tools Needed:

Airtable

Google Analytics (with permissions)

Excel


Procedures:

  1. Open Google Analytics page for the Library Encore (Exclude Internal)

  2. Pull data for a single day by going to:

    1. Behavior -> Site Content -> All Pages

    2. Limit to a single day (or appropriate time frame as determined by the project)

    3. Set the number of records shown to be above the number of records listed for that day

    4. Export into an Excel Sheet

  3. Pull the timed data by going to the Customization _> Custom reports in Google Analytics (Note: This custom report was created specifically for the MARC Discoverabilty project. It includes the hour and minute in which a URL was accessed, along with the number of times it was accessed in that minute.)

    1. Click on Search Patterns report and change the date to match the first report.

    2. Export from Google Analytics

  4. Open the first Excel export

    1. Sort the second tab (Dataset1) by the Page column

    2. Add a tab and label it “Time Searched”

  5. Open the second Excel export (time based)

    1. In column E, put the following formula, substituting the “DATE” for the date selected in the previous steps.  Structure the date in MM/DD/YYYY format:

      1. =concatenate("DATE ",B2,":",C2)

      2. Example:  =concatenate("04/09/2020 ",B2,":",C2)

    2. Copy the formula down for all entries.

  6. Copy the data from the second (time based) Excel export and paste into a fourth tab in the first export called ‘Time Based’

  7. Using a V-lookup formula, associate the URL with the time it was searched

    1. In column I, label the column “Time Searched”

    2. In cell I2, input the vlookup formula, substituting the range in Time Searched with the appropriate values for the range in that sheet:

      1. =VLOOKUP(A2,'Time Searched'!FIRSTCELL:LASTCELL,5,1)

      2. Example:    =VLOOKUP(A2,'Time Searched'!A2:E1118,5,1)

    3. Copy formula down for all the URLS

    4. Copy the Column I and paste as a value to get rid of the formula

    5. Scan for an missing values.  For all found, review the list of URLs in the Time Search and find the correct time

  8. In the first export, copy all URLs (along with their analytics metrics and time searches)

  9. Open the Discoverability Research database in Airtable

    1. Paste the URLS, Time Searched, and analytics data into the URLS tab

    2. Link all new entries to the “Batch” tab, by indicating the Batch Number in the “Batch #” column

  10. Update the Link type column:

    1. For all URLS with a C__S, indicate “Search”

    2. For all URLS with C__R, indicate “Record”

  11. Using the Excel file, create a new tab called “URLS to process”.  Copy the URLS from Dataset2 tab into this file.  Concatenate the URLS with the prefix “https://discover.lib.usu.edu” .  Copy and paste the results as values. 

Related content

Pulling Web Logs from Google Analytics for EAD Guides
Pulling Web Logs from Google Analytics for EAD Guides
More like this
Pulling Web Logs from Google Analytics and webscraping with Octoparse for CONTENTdm Digital Collections
Pulling Web Logs from Google Analytics and webscraping with Octoparse for CONTENTdm Digital Collections
More like this
Coding Web Logs for CONTENTdm Digital Collections
Coding Web Logs for CONTENTdm Digital Collections
More like this