If you’re an Apache Spark developer, this practical book provides an introduction to the Scala programming language to help you get more out of this framework. Spark is written in Scala and uses the language's rich Domain-Specific Language (DSL) capabilities to present SQL views, extensibility, streaming, and DataFrames. With Scala, you'll get performance on par with Java and be able to work with distributed systems based on the Java Virtual Machine (JVM).
Spark succeeded mostly because it took the very intuitive Scala collections API and made it work on a cluster, unifying the memory of all of its machines to present a coherent view of a "big data" collection. Spark tries to conform to the Scala API as closely as possible, and in this book, we take the view that Spark is "simply" distributed Scala. That makes many aspects of Spark much easier to understand.
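As a rough illustration of the collections style the synopsis refers to, here is a minimal word-count sketch using only the Scala standard library. The object name `WordCountSketch`, the method `wordCounts`, and the sample sentences are hypothetical and invented for this example; none of it is taken from the book.

```scala
// Plain Scala collections only; no Spark dependency is needed to run this.
object WordCountSketch {
  // Hypothetical sample data, invented for this illustration.
  val lines: List[String] = List(
    "spark is distributed scala",
    "scala collections are intuitive"
  )

  // Split each line into words, drop empty tokens,
  // then count how many times each word occurs.
  def wordCounts(input: List[String]): Map[String, Int] =
    input
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit =
    wordCounts(lines).toSeq.sortBy(_._1).foreach {
      case (word, count) => println(s"$word $count")
    }
}
```

Spark's RDD API offers methods with the same names for the first steps (`flatMap`, `filter`, `map`), and the counting step is commonly written there with `reduceByKey` over `(word, 1)` pairs. That resemblance is presumably what the authors mean by "distributed Scala".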
"Sinopsis" puede pertenecer a otra edición de este libro.
Fast distributed computing in the enterprise
About the Author:
Alexy Khrabrov is the Chief Scientist at Nitro, a web-scale document productivity company running on Scala and Spark. He has programmed in Scala since 2009 and is the founder of the Scala for Startups meetup, which merged with SF Scala to become the largest Scala meetup in the world, sfscala.org. Alexy is an early pioneer of Spark, having used it since 2012, and the founding co-organizer of Spark Users (sfspark.org), spun off from SF Scala. Alexy is also the founder of SF Text, an applied text mining/NLP/search/AI meetup (sftext.org). Alexy founded and runs three developer conferences in the San Francisco Bay Area: Scala By the Bay (scala.bythebay.io), Text By the Bay (text.bythebay.io), and Big Data Scala By the Bay (bigdatascala.org). He lives in Oakland with his wife and four children.
Andy Petrella is a mathematician turned distributed computing entrepreneur, as well as a Scala and Spark trainer. Andy has participated in many projects built using Spark, Cassandra, and other distributed technologies, in fields including geospatial, IoT, automotive, and smart cities.
Andy is the creator of the Spark Notebook, the only reactive and fully Scala notebook for Apache Spark.
In 2015, Andy founded Data Fellas, working on an integrated and reactive distributed data science toolkit orchestrated from within the Spark Notebook.
Xavier Tordoir started his career as a researcher in experimental physics, focused on data processing. He took part in projects in finance, genomics, and software development for academic research, working on time series, prediction of biological molecular structures and interactions, and applied machine learning methodologies. He developed solutions to manage and process data distributed across data centers. Xavier founded and works at Data Fellas, a company dedicated to distributed computing and advanced analytics, leveraging Scala, Spark, and other distributed technologies.
"About this title" may belong to another edition of this book.
Description: O'Reilly Media. PAPERBACK. Condition: New. 1491929286 *BRAND NEW* Ships Same Day or Next! Bookseller reference no. SWATI21FI636288