Towards dynamic SQL compilation in Apache Spark

Schiavio, Filippo; Bonetta, Daniele; Binder, Walter

doi:10.1145/3397537.3397566

Conference paper (in proceedings)

2020

Published in:

Companion Proceedings of the 4th International Conference on the Art, Science, and Engineering of Programming (<Programming’20> Companion), March 23–26, 2020, Porto, Portugal. - 2020, 4 p.

Big-data systems have gained significant momentum, and Apache Spark is becoming a de facto standard for modern data analytics. Spark relies on code generation to optimize the execution performance of SQL queries on a variety of data sources. Despite its already efficient runtime, Spark’s code generation suffers from significant runtime overheads related to data de-serialization during query execution. This performance penalty can be significant, especially when applications operate on human-readable data formats such as CSV or JSON.

Collections

USI Faculty of Informatics

Language

English

Classification

Computer science and technology

License

License undefined

Open access status

green

Identifiers

DOI 10.1145/3397537.3397566
ARK ark:/12658/srd1324935

Persistent URL

https://n2t.net/ark:/12658/srd1324935

Apache Spark SQL

SQL Compilation

Dynamic Compilation
