Pulling Web Logs from Google Analytics
Purpose: This procedure describes the process for extracting web logs from Google Analytics for USU Libraries' Encore instance. Please note that theses procedures include steps that are preparatory for the MARC Discoverability project, but that may be modified easily for other projects that start with web logs.
Tools Needed:
Airtable
Google Analytics (with permissions)
Excel
Procedures:
Open Google Analytics page for the Library Encore (Exclude Internal)
Pull data for a single day by going to:
Behavior -> Site Content -> All Pages
Limit to a single day (or appropriate time frame as determined by the project)
Set the number of records shown to be above the number of records listed for that day
Export into an Excel Sheet
Pull the timed data by going to the Customization _> Custom reports in Google Analytics (Note: This custom report was created specifically for the MARC Discoverabilty project. It includes the hour and minute in which a URL was accessed, along with the number of times it was accessed in that minute.)
Click on Search Patterns report and change the date to match the first report.
Export from Google Analytics
Open the first Excel export
Sort the second tab (Dataset1) by the Page column
Add a tab and label it “Time Searched”
Open the second Excel export (time based)
In column E, put the following formula, substituting the “DATE” for the date selected in the previous steps. Structure the date in MM/DD/YYYY format:
=concatenate("DATE ",B2,":",C2)
Example: =concatenate("04/09/2020 ",B2,":",C2)
Copy the formula down for all entries.
Copy the data from the second (time based) Excel export and paste into a fourth tab in the first export called ‘Time Based’
Using a V-lookup formula, associate the URL with the time it was searched
In column I, label the column “Time Searched”
In cell I2, input the vlookup formula, substituting the range in Time Searched with the appropriate values for the range in that sheet:
=VLOOKUP(A2,'Time Searched'!FIRSTCELL:LASTCELL,5,1)
Example: =VLOOKUP(A2,'Time Searched'!A2:E1118,5,1)
Copy formula down for all the URLS
Copy the Column I and paste as a value to get rid of the formula
Scan for an missing values. For all found, review the list of URLs in the Time Search and find the correct time
In the first export, copy all URLs (along with their analytics metrics and time searches)
Open the Discoverability Research database in Airtable
Paste the URLS, Time Searched, and analytics data into the URLS tab
Link all new entries to the “Batch” tab, by indicating the Batch Number in the “Batch #” column
Update the Link type column:
For all URLS with a C__S, indicate “Search”
For all URLS with C__R, indicate “Record”
Using the Excel file, create a new tab called “URLS to process”. Copy the URLS from Dataset2 tab into this file. Concatenate the URLS with the prefix “https://discover.lib.usu.edu” . Copy and paste the results as values.