Discrepancy Investigation

An agency or advertiser relies on several vendors to run their campaigns (for example, Ad-server, publishers, DSP). Sizmek classifies every discrepancy between the different vendors as a data discrepancy ticket. The majority of tickets are Sizmek (Ad-server) vs Publisher.

For any discrepancy of more than 10%, we first validate that the Sizmek metric and the publisher metric in question are based on the same logic.
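As a rough illustration of the 10% threshold, the check can be sketched in Scala as follows. The article does not state which count serves as the baseline, so this sketch assumes the gap is measured relative to the publisher's count. The object and function names are hypothetical and are not part of any Sizmek tooling.

```scala
object DiscrepancyCheck {
  // Assumption: the discrepancy is measured relative to the publisher's count.
  def discrepancyPct(sizmekCount: Long, publisherCount: Long): Double = {
    require(publisherCount > 0, "publisher count must be positive")
    math.abs(sizmekCount - publisherCount).toDouble / publisherCount * 100.0
  }

  // True when the gap exceeds the threshold (10% by default, as in the text above).
  def needsInvestigation(sizmekCount: Long,
                         publisherCount: Long,
                         thresholdPct: Double = 10.0): Boolean =
    discrepancyPct(sizmekCount, publisherCount) > thresholdPct
}
```

For example, `DiscrepancyCheck.needsInvestigation(880L, 1000L)` reports a gap of about 12% and so returns `true`, while a Sizmek count of 950 against the same publisher count stays under the threshold.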

When invalid traffic (IVT) affects more than 10% of the traffic, we consider it material to the campaign and communicate it to the client and the relevant media source as part of the investigation. The parties can then use the information to stop the traffic source and recover lost media (this is an Advertiser/Publisher conversation that Sizmek can facilitate with data as needed). IVT data investigation queries are usually run on two to four weeks of raw data, and can sometimes cover up to three months, for both filtered and non-IVT rows. In conjunction with this internal process, advertisers or publishers may challenge the Sizmek IVT metrics and initiate a support investigation at any time using the help link in the Sizmek Ad Suite reporting user interface.
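The materiality test described above can be sketched as a share computed over raw rows. This is a minimal sketch under assumptions: the `Event` row shape and its `isIvt` flag are hypothetical stand-ins for whatever the raw data actually exposes, and "material" is read here as strictly more than 10%.

```scala
object IvtShare {
  // Hypothetical row shape: one served event plus a flag set by IVT filtering.
  final case class Event(placementId: String, isIvt: Boolean)

  // Share of events flagged as IVT, in percent; 0.0 for an empty input.
  def ivtPct(events: Seq[Event]): Double =
    if (events.isEmpty) 0.0
    else events.count(_.isIvt).toDouble / events.size * 100.0

  // Assumption: "material" means the share is strictly above the threshold.
  def isMaterial(events: Seq[Event], thresholdPct: Double = 10.0): Boolean =
    ivtPct(events) > thresholdPct
}
```

With 3 flagged events out of 20, the share is about 15%, so `isMaterial` returns `true`.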

In rare cases of very large discrepancies that have occurred over a long period of time and for large-scale client traffic, Sizmek will re-run and process traffic that was detected as part of the investigation and was not filtered before. This action is reserved for rare cases because of the high costs and considerable complexity, for all involved parties, of altering past data in all reporting and invoicing systems. In most cases, Sizmek prepares a separate custom report that includes what is now known to be valid traffic, and a record is kept in the Global support ticket. Data from this point onward will be correct in the reporting systems. In many cases, the advertiser will require make-good impressions, which must be agreed upon with the media source.

  1. Sizmek counts less than the Publisher/Vendor. Steps to validate Sizmek data include the following:

    1. Validate report data vs the raw data (S3 logs via Zeppelin) for Sizmek Ad Suite data. If the data does not match, move to the middle tier (Redshift) and check the data (reprocess, if necessary).

    2. For an impressions discrepancy, pull the ad-request report and compare it to the publisher data.

    3. Check the filtered robots activity:

      • DB filtration: S3-logs via Zeppelin.

      • Server filtration: Check EDS Hadoop data for activity filtered due to SizmekIP, IABBlacklist, and blank serving.

    4. Ask the publisher for raw data and check trends based on IP, user-agent, and event-time.

  2. Sizmek counts more than the Publisher/Vendor.

    1. Validate report data vs the raw data (S3 logs via Zeppelin) for Sizmek Ad Suite data.

      If the data does not match, move to the middle tier (Redshift) and check the data (reprocess, if necessary).

    2. Check for trends on Sizmek raw data, and cluster the database on IP, user-agent, userID, sessionID, OS, browser, and event-time.

      Classic behavior is a specific IP, IP range, or user-agent that triggered many calls to Sizmek but did not appear on the Publisher side. Eliminating the suspicious amount should bring the total in line with the publisher data.

    3. Ask the publisher for raw data to make a complete comparison vs Sizmek raw data (plus robots).
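The clustering check for the case where Sizmek counts more than the publisher can be sketched in plain Scala as follows. This is only an illustration of the idea: the `Hit` row shape, the function names, and the `minHits` cut-off are hypothetical, and a real investigation would run an equivalent aggregation over the raw data in Zeppelin rather than on in-memory collections.

```scala
object TrafficClusters {
  // Hypothetical raw-data row; the field names are illustrative only.
  final case class Hit(ip: String, userAgent: String)

  // Count hits per (IP, user-agent) pair.
  def countByKey(hits: Seq[Hit]): Map[(String, String), Int] =
    hits.groupBy(h => (h.ip, h.userAgent)).map { case (k, v) => k -> v.size }

  // Pairs that fired at least minHits calls on the Sizmek side
  // but never appear in the publisher's raw data.
  def suspicious(sizmek: Seq[Hit],
                 publisher: Seq[Hit],
                 minHits: Int): Map[(String, String), Int] = {
    val publisherKeys = countByKey(publisher).keySet
    countByKey(sizmek).filter { case (key, n) =>
      n >= minHits && !publisherKeys.contains(key)
    }
  }

  // Sizmek total after removing the suspicious hits,
  // for comparison with the publisher total.
  def adjustedTotal(sizmek: Seq[Hit], publisher: Seq[Hit], minHits: Int): Int =
    sizmek.size - suspicious(sizmek, publisher, minHits).values.sum
}
```

If one IP and user-agent pair accounts for 50 Sizmek hits that the publisher never saw, and the remaining 3 hits match the publisher's 3, `adjustedTotal` returns 3, which lines up with the publisher total.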

Ticket Closing

  • Data is not removed from the DB once entered; we notify the CS about it.

  • We explain the issue to the CS and, if required, deliver a custom report with all the information and data, excluding IP addresses.

  • Delivering IP addresses requires justification and Sizmek upper-level approval.

  • Sizmek raw data will be delivered to the Publisher, if necessary.

  • Billing charge is based on the reports; filtered traffic is not billed.

Queries and Results

You can generate reports by using queries and viewing results in Microsoft Excel®.

S3 DB Robots


Note: This information is available for the placement level only.

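// Load the robots (filtered) serving data for one account and day, then keep one flight.
val df7 = sqlContext.read.format("json").schema(eventsSchema1).load("s3n://prod-raw-dwh-eu-fr/robots/ServingData/account=112098/event_time=20160918/*");

val TestFiltered1 = df7.filter(df7.col("FlightID").equalTo("17687698"))

// Assumed step, missing from the original: expose the result to SQL under the name queried below.
TestFiltered1.registerTempTable("TestTempTable")

Select * from TestTempTable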


Relevant Knowledge-Base Articles

Apache Zeppelin
