Name: CIS Seminar: “Prioritizing Computation and Analyst Resources in Large-scale Data Analytics”
Start: 2021-03-23T15:00:00-04:00
End: 2021-03-23T16:00:00-04:00
Location: Zoom – Email CIS for link

CIS Seminar: “Prioritizing Computation and Analyst Resources in Large-scale Data Analytics”

March 23, 2021 at 3:00 PM - 4:00 PM

Details

Date: March 23, 2021

Time: 3:00 PM - 4:00 PM

Event Tags: CIS

Organizer

Computer and Information Science

Phone: 215-898-8560

Email: cherylh@cis.upenn.edu

Venue

Zoom (email CIS for the link: cherylh@cis.upenn.edu)

Data volumes are growing exponentially, fueled by an increasing number of automated sources such as sensors and devices. Meanwhile, both the computational power available for processing this data and analysts’ ability to interpret it remain limited. As a result, database systems must evolve to address these new bottlenecks in analytics. In my work, I ask: how can we adapt classic ideas from database query processing to modern compute- and analyst-limited data analytics?

In this talk, I will discuss the potential for this kind of systems development through the lens of several practical systems I have developed. By drawing on insights from database query optimization, such as pushing workload- and domain-specific filtering, aggregation, and sampling into core analytics workflows, we can dramatically improve the efficiency of analytics at scale. I will illustrate these ideas by focusing on two systems, both of which have been field-tested at scale: one designed to optimize visualizations for streaming infrastructure and application telemetry, and one designed for high-volume seismic waveform analysis. I will also discuss lessons from production deployments at companies including Datadog, Microsoft, Google, and Facebook.
