Timely and cost-effective analytics over “Big Data” is now a key ingredient for success in many businesses, scientific and engineering disciplines, and government endeavors. The Hadoop software stack, which consists of an extensible MapReduce execution engine, pluggable distributed storage engines, and a range of procedural to declarative interfaces, is a popular choice for big data analytics. Most practitioners of big data analytics, such as computational scientists, systems researchers, and business analysts, lack the expertise to tune the system to get good performance. Unfortunately, Hadoop’s performance out of the box leaves much to be desired, leading to suboptimal use of resources, time, and money (in pay-as-you-go clouds). We introduce Starfish, a self-tuning system for big data analytics. Starfish builds on Hadoop while adapting to user needs and system workloads to provide good performance automatically, without any need for users to understand and manipulate the many tuning knobs in Hadoop. While Starfish’s system architecture is guided by work on self-tuning database systems, we discuss how new analysis practices over big data pose new challenges, leading us to different design choices in Starfish.