When it comes to higher education, Californians have a lot of choices, with 10 University of California (UC) campuses, 23 California State Universities (CSU), and 116 community colleges. The state boasts the largest, most comprehensive higher education ecosystem in the nation, due in large part to the visionary California Master Plan for Higher Education. However, when it comes to meeting the workforce needs of a 21st-century economy powered by data and AI, there is no comprehensive strategy among the state’s public universities.
In the fast-emerging field of data science -- with most schools creating their own teaching structures, majors, and minors -- building a new program from scratch can be challenging. The same is true for transferring course credit from a community college to a UC or CSU campus and for setting up cloud infrastructure for a course. The California Alliance for Data Science Education is a new initiative, spearheaded by UC Berkeley’s Division of Computing, Data Science, and Society, that convenes institutions of higher education from around California to solve these bottlenecks and challenges. Its key focus areas include:
- Training instructors regarding data science pedagogy, highlighting Berkeley’s Data 8 model along with the experiences of other institutions.
- Providing cloud-computing education infrastructure through other UC Berkeley-led initiatives such as 2i2c (International Interactive Computing Collaboration) and Cloudbank.
- Streamlining articulation, so that a student who takes a data science class at one institution is guaranteed credit for that coursework when transferring to another school.
Articulation is one of the biggest bottlenecks in advancing data science education statewide. Currently, 30 percent of UC students transfer from community colleges, where they take required general education classes before moving to a UC or CSU campus. In many academic fields, education systems have agreed on certain course requirements and award credits for classes taken. Without proper transfer credit protocols for data science, community colleges will not have an incentive to create data science courses. In turn, students will be deprived of learning about this pathway, which can lead to a wide range of career options.
The Alliance steering committee currently comprises six UC campuses, five Cal State universities, four community colleges, and one private college, with the collective goal of broadening access to data science education across the state. Eventually, the group also wants to bring in industry partners to learn what skills businesses are looking for and to increase the number of internships.
CA Alliance Panel Discussion
Representatives from some of the schools belonging to the 16-institution alliance gave an update on programs and procedures during a June 16 panel discussion at the online National Workshop on Data Science Education, organized by UC Berkeley’s Division of Computing, Data Science, and Society (CDSS).
University of California Overview
The lively panel discussion illustrated the disparate approaches that member institutions have taken to developing and expanding their data science programs. Aaron Fraenkel, an assistant teaching professor and chair of the undergraduate data science program at UC San Diego, gave an overview of the offerings across the UC system. One challenge facing all of the schools is recruiting faculty to teach the courses.
“It’s hard to staff math and science, to begin with,” said Fraenkel. “Combining those makes it even more difficult.”
Among the UC programs, Berkeley’s Data Science for Undergraduate Studies is the largest, with 1,100 majors and 600 minors. As part of CDSS, the program leverages instructors from Statistics and Computer Science to teach the core data classes and a growing number of electives. By contrast, UC San Diego houses its undergraduate program within a Data Science Institute that has its own faculty, with about 800 majors and 170 minors. UC Irvine has offered a data science major since 2015 and currently enrolls 157 students. UC Riverside is the newest addition, starting with a smaller cohort of 20 students.
“Berkeley is leading the way in a broad variety of ways,” Fraenkel said. He cited the Data 8 introductory class which has been emulated by schools around the country, the growing number of electives, the connector courses bridging disciplines, and the modules created to drop data science lessons into other courses.
California State University Overview
Hunter Glanz, an associate professor of Statistics at Cal Poly San Luis Obispo, pointed out that the CSU system is the nation’s largest four-year public university system, with 500,000 students and 4 million alumni. Cal Poly created an innovative cross-disciplinary study minor in data science in 2014 and now has a data science component for all courses in the pipeline, as well as a traditional minor and bachelor’s program in data science.
The goal is to create a “fully connected, campus-wide data science community” bridging pedagogy and research, Glanz said, and “reducing the ‘siloed-ness’ of data science.”
Up the coast at Cal State Monterey Bay, the data science minor is concentrated in computer science and requires two semesters of programming, said Statistics Professor Judith Canner. The school recently approved a more collaborative minor that’s not just in computer science and statistics. “We want something accessible to students from any major,” Canner said, and the goal is to embed as much data science into the general education classes as possible.
California Community College Overview
At the community college level, City College of San Francisco (CCSF), Skyline College in San Bruno, and Santa Barbara City College (SBCC) are introducing new classes based on Berkeley’s Data 8 introductory course. At CCSF, the course is open to any student, said Mathematics Professor Katia Fuchs, and the school is trying to recruit embedded tutors to support the students. Most of the students take courses to meet UC or CSU requirements, but Fuchs said it can take two to three years to develop a class that articulates to one of the state universities.
Nathalie Guebels, chair of the SBCC Computer Science Department, also cited the challenge of finding instructors. She said most faculty already teach three full classes and don’t have time to learn another curriculum, leading the school to look for funding to add new data classes and train faculty. She agreed with Fuchs on the need to support students and make sure they thrive.
SBCC is also a member of the National Science Foundation-funded Central Coast Data Science Partnership, which includes Cal Poly, UC Santa Barbara, and Cal State San Bernardino. The program establishes pathways for data science training through coursework and real-world projects, connecting California’s three main public higher education systems.
Denise Hum, a professor of science, math, and technology at Skyline College, said the school is offering its first Data 8-based course this summer to 25 students. Starting in the fall, the class will be transferable to state universities, she said, as a result of help from Cal State East Bay.
“The panel showed the incredible breadth and growth of data science education in California,” said Anthony Suen, co-founder of the California Alliance and Director of Programs for Berkeley’s Data Science Undergraduate Studies. “We hope to build on its success to inspire more educators from across the state to develop their own data science/analytics programs based on the Berkeley model. Training students in this increasingly important field is essential to helping California to maintain its lead as a global hub of innovation.”
Watch the California Alliance for Data Science Education panel discussion.