The article “Using Big Data to Craft the Well-timed Email” explores a better way to use time-stamped transaction data. Viewing data across the calendar year lends itself to forecasting where sales (or revenue) are headed in the coming periods. While these sorts of projections are useful to members of a finance team or to a company’s investors, they do little to help an online retailer hone a marketing strategy. However, if one aggregates time-stamped transaction data by the second to analyze how volume and price change over the course of the day, one may be able to determine at what time of day, or on what day of the week, to send email promotions.

The analysis was conducted with a dataset of roughly 4.2 million time-stamped records from one of our clients. Given that the intent of the analysis was to investigate variations in shopper behavior by time of day, the first step towards producing meaningful insights was to adjust the timestamps according to the time zone of each customer. While the primary dataset included a field indicating shipping state, the data was neither consistent nor specific enough to adjust the time properly. As such, a second dataset consisting of 1.4 million records of Google Analytics data was used to plot each transaction ID according to the latitude and longitude of the transaction city. These coordinates were spatially joined to a polygon file of world time zones and merged back to the original dataset by transaction ID, at which point the timestamps were adjusted. The final step of data preparation entailed removing observations from before 2006 on account of data-collection irregularities.
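The merge-and-adjust step can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the analysis: it assumes the raw timestamps are stored in UTC, that the spatial join has already produced a UTC offset for each transaction ID, and that transactions without a location match are dropped. The record layout and the names `transactions`, `utc_offset_hours`, and `localize` are hypothetical. Fixed offsets also ignore daylight saving time, which a real pipeline would handle with named time zones.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical transaction records: (transaction_id, UTC timestamp, price).
transactions = [
    ("t1", datetime(2010, 3, 5, 17, 30, tzinfo=timezone.utc), 40.0),
    ("t2", datetime(2010, 3, 5, 17, 30, tzinfo=timezone.utc), 25.0),
    ("t3", datetime(2005, 6, 1, 12, 0, tzinfo=timezone.utc), 10.0),
    ("t4", datetime(2011, 7, 9, 2, 15, tzinfo=timezone.utc), 60.0),
]

# Hypothetical output of the spatial join: transaction_id -> UTC offset (hours).
# "t4" is missing on purpose to show an unmatched transaction.
utc_offset_hours = {"t1": -5, "t2": -8, "t3": -6}


def localize(transactions, offsets, min_year=2006):
    """Shift each timestamp to the customer's local time and drop early records."""
    localized = []
    for tx_id, ts_utc, price in transactions:
        offset = offsets.get(tx_id)
        if offset is None:
            continue  # assumption: skip transactions with no location match
        local_ts = ts_utc.astimezone(timezone(timedelta(hours=offset)))
        if local_ts.year < min_year:
            continue  # remove observations from before 2006
        localized.append((tx_id, local_ts, price))
    return localized


local_transactions = localize(transactions, utc_offset_hours)
for tx_id, local_ts, price in local_transactions:
    print(tx_id, local_ts.strftime("%Y-%m-%d %H:%M"), price)
```

Here the same UTC instant lands at different local hours for `t1` and `t2`, which is the reason the adjustment has to happen before any time-of-day aggregation.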

The analysis was primarily exploratory, using plots to drive intuition and guide future research. For illustrative purposes, the plots displayed below are smoothed lines fitted to the data with a smoothing spline. The first plot shows that sales volume tends to peak right before lunch, remains close to that level throughout the workday, declines through dinner, and peaks again around 10 PM.
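As a rough illustration of how such a volume curve could be built, the sketch below counts transactions per local hour and then smooths the 24 counts. To stay within the Python standard library, a centered moving average that wraps around midnight stands in for the smoothing spline used in the actual plots, so the resulting curve would differ in shape. The function names and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_volume(timestamps):
    """Count transactions in each local hour of the day (0 through 23)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def circular_moving_average(values, window=3):
    """Smooth a daily profile with a centered window that wraps past midnight.

    `window` is assumed to be odd so the window is symmetric around each hour.
    """
    n = len(values)
    half = window // 2
    return [
        sum(values[(i + k) % n] for k in range(-half, half + 1)) / window
        for i in range(n)
    ]


# Hypothetical local timestamps, already adjusted for time zone.
sample = [
    datetime(2010, 1, 1, 11, 5),
    datetime(2010, 1, 2, 11, 40),
    datetime(2010, 1, 1, 22, 0),
]
volume = hourly_volume(sample)
smoothed = circular_moving_average(volume, window=3)
```

Wrapping the window matters for this kind of data because 11 PM and midnight are adjacent hours, even though they sit at opposite ends of the list.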


Producing the same sort of plot for price over the course of the day adds more texture to the variation in consumer behavior. We see that item price peaks in the morning and trends downward over the course of the day.
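The price profile could be computed along the same lines, averaging item price within each local hour. This is only a sketch: the analysis does not state which summary statistic was plotted, so the use of the mean here is an assumption, and `mean_price_by_hour` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_price_by_hour(records):
    """Average item price per local hour; hours with no sales are omitted."""
    prices = defaultdict(list)
    for ts, price in records:
        prices[ts.hour].append(price)
    return {hour: mean(values) for hour, values in sorted(prices.items())}


# Hypothetical (local timestamp, item price) pairs.
records = [
    (datetime(2010, 3, 5, 8, 15), 80.0),
    (datetime(2010, 3, 6, 8, 45), 60.0),
    (datetime(2010, 3, 5, 21, 10), 30.0),
]
price_profile = mean_price_by_hour(records)
```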



When this analysis was conducted for each time zone independently, we saw similar patterns across all time zones. Similarly, when volume and price were plotted by time of day and day of the week, behavior was relatively consistent from day to day, with two notable exceptions: the evening peak was missing on Friday, and the morning peak was missing on Sunday.
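A breakdown by day of the week only requires adding the weekday to the grouping key. The sketch below, with hypothetical names and sample data, counts transactions per (day, hour) pair so that a missing peak on a given day would show up as a low count in the corresponding cells.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def volume_by_day_and_hour(timestamps):
    """Count transactions for each (day of week, local hour) pair."""
    return Counter((DAY_NAMES[ts.weekday()], ts.hour) for ts in timestamps)


# Hypothetical local timestamps: two on a Friday evening, one on a Sunday morning.
sample = [
    datetime(2010, 3, 5, 22, 5),
    datetime(2010, 3, 5, 22, 40),
    datetime(2010, 3, 7, 9, 0),
]
counts = volume_by_day_and_hour(sample)
```

The same key could be extended with a time zone label to repeat the comparison for each zone independently.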

While the plots above are not intended as an in-depth analysis of customer behavior by time of day and day of the week, the intuitions stemming from them are a good first step towards improving marketing efficacy. This exploratory analysis will also drive more detailed research into how consumer behavior varies by time of day.