Modernizing NCES’s Annual Reports (ARIS)

The AIR Annual Reports and Information Staff (ARIS) team has supported NCES in developing, producing, and disseminating high-quality, timely, and reliable statistical reports and web products on education topics of relevance to policy makers, researchers, and the general public. Our team has authored and produced several of the most widely used NCES publications (e.g., The Condition of Education (COE), The Digest of Education Statistics (Digest), and Indicators of School Crime and Safety). In addition to these annual reports, AIR also has been responsible for a variety of other recurring and one-time analytic, indicator, and tabular reports on education statistics (e.g., Study of the Title I, Part A Grant Program Mathematical Formulas and Student Access to Digital Learning Resources Outside of the Classroom).

The work conducted under ARIS covers a wide range of key education topics (e.g., achievement gaps and college affordability) and a wide range of data sources (e.g., sample, administrative, and assessment data, including ECLS, HSLS, EDFacts, CCD, CRDC, IPEDS, ACS, and CPS). AIR staff also provide outreach support through social media posts, graphics, and video materials to reach diverse audiences. More recently, AIR has been supporting the modernization and automation of ARIS and its various data products.

AIR has developed the architecture and wireframes for a COE Web Portal that gives users easy access to all ARIS data products via relevant topic webpages. AIR is currently refining the database for all Digest data table components to support the development of interactive figures, which were first released for 10 COE indicators in May 2022 and expanded to a larger number of indicators for COE 2023 in May 2023. In addition to converting COE indicators to interactive figures, AIR has been developing the Digest State Dashboard. This dashboard is a new initiative that will provide a centralized location for users to access state-level education statistics. Data will be drawn from existing Digest tables and will cover a wide range of topics, including enrollments, teachers, graduation rates, standardized assessment scores, finances, and postsecondary tuition and fees.

The Challenge

In our current work to modernize the COE indicators, we are tasked with bringing data from a wide range of sources into a single database to streamline the creation of interactive figures for the COE indicators and the Digest State Dashboard. To do this, we define rules that the figures must follow, with the goal of automating figure creation. A significant portion of the project's challenges involves managing the many exceptions found in figures and tables, which requires solutions that handle these exceptions efficiently while keeping up with a substantial workload. Our top priority is to respond to quick-turnaround requests to update COE indicators with new data and interactive features, while ensuring that high-quality statistical analyses and reporting procedures are applied consistently across tasks.
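The source does not describe how these rules and exceptions are encoded. As a minimal sketch of one possible pattern, shared defaults can be combined with per-figure overrides so that each exception is recorded in one place. All rule names, figure identifiers, and values below are hypothetical.

```python
# Hypothetical default rules applied to every figure (names are illustrative).
DEFAULT_RULES = {
    "chart_type": "bar",
    "decimal_places": 0,
    "group_order": "alphabetical",
}

# Hypothetical per-figure exceptions; only the differing settings are listed.
FIGURE_EXCEPTIONS = {
    "figure_2": {"chart_type": "line"},
    "figure_5": {"decimal_places": 1, "group_order": "as_published"},
}


def resolve_rules(figure_id):
    """Return the default rules with any exceptions for this figure applied."""
    overrides = FIGURE_EXCEPTIONS.get(figure_id, {})
    # Reject misspelled or unsupported rule names instead of silently ignoring them.
    unknown = set(overrides) - set(DEFAULT_RULES)
    if unknown:
        raise KeyError(f"Unknown rule(s) for {figure_id}: {sorted(unknown)}")
    # Later keys win, so overrides replace the matching defaults.
    return {**DEFAULT_RULES, **overrides}


for figure_id in ("figure_1", "figure_2", "figure_5"):
    print(figure_id, resolve_rules(figure_id))
```

Keeping exceptions as data rather than as branches in the figure code means a new exception is a one-line change, which suits quick-turnaround updates.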

Our Role

AIR was tasked with creating a data pipeline that would streamline the multi-step data preparation, data analysis, and figure generation processes, with the goal of automating the creation of interactive figures for the COE releases. In response, the AIR ARIS team designed the machine-readable table (MRT) structure and created MRTs for current and future Digest tables using web-scraping techniques. The MRT has a standardized format that makes Digest table data easier for software programs to access and use, in line with the requirements of the Open Data Act and the Foundations for Evidence-Based Policymaking Act. The ARIS modernization team's data pipeline processes raw input files (such as CCD, IPEDS, and CPS), cleans the content, and loads the resulting data into a database used to generate figures for the COE indicators. A web interface allows users to interact with the data and make further adjustments when generating public-facing data products. The infrastructure was built using Azure, and the database is restricted to users with the correct credentials and a connection to the company VPN.
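The clean-and-load steps described above can be illustrated with a small sketch. It is not the ARIS implementation: the actual MRT schema is not given here, so the long-format layout (one record per data cell), the column names, the sample table, and the treatment of special symbols are all assumptions, and an in-memory SQLite database stands in for the Azure-hosted database.

```python
import csv
import io
import sqlite3

# Hypothetical wide-format table standing in for one scraped Digest table.
RAW_TABLE = """state,2020,2021
Alabama,"1,234",†
Alaska,567,589
"""

# Symbols treated as missing-value flags (assumed for this sketch).
SPECIAL_SYMBOLS = {"†": "not applicable", "‡": "reporting standards not met"}


def to_long_rows(table_id, raw_text):
    """Flatten a wide table into one record per data cell."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        row_label = record["state"].strip()
        for column_label, cell in record.items():
            if column_label == "state":
                continue
            cell = cell.strip()
            if cell in SPECIAL_SYMBOLS:
                # Keep the reason a value is missing instead of dropping the cell.
                value, flag = None, SPECIAL_SYMBOLS[cell]
            else:
                # Remove thousands separators before converting to a number.
                value, flag = float(cell.replace(",", "")), None
            rows.append((table_id, row_label, column_label, value, flag))
    return rows


def load_rows(connection, rows):
    """Create the (hypothetical) mrt table if needed and insert the records."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS mrt ("
        "table_id TEXT, row_label TEXT, column_label TEXT, value REAL, flag TEXT)"
    )
    connection.executemany("INSERT INTO mrt VALUES (?, ?, ?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
load_rows(connection, to_long_rows("example_table", RAW_TABLE))
for row in connection.execute("SELECT row_label, column_label, value, flag FROM mrt"):
    print(row)
```

Storing one record per cell, with a separate flag column for special symbols, lets figure-generation code query any table the same way regardless of its original layout.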

Outcome

AIR’s development of the MRT system and data pipeline has supported NCES in increasing the usability and accessibility of current and future Digest table data. The data pipeline has also allowed for the accurate and efficient creation of figures for COE indicators, with new interactive features that let users explore the data in ways that were not previously possible. We have now added interactive functionality to around 200 COE indicators, with more to come.

Considerations for Diversity, Equity, Inclusion, and Accessibility

The primary objective behind the development of MRTs was to enhance the accessibility of tabulated data sources, which were previously challenging to analyze using computer software. Moreover, the figures generated within the COE indicators are designed to help users extract the information they need from the data and customize the figures to their specific requirements. All COE indicators have accompanying 508-compliant PDFs to make them more accessible to individuals with disabilities. Starting with this edition, COE indicators have also begun changing the customary racial/ethnic ordering of “White, Black, Hispanic…” to a more equitable alphabetical ordering of the groups.