New Life for Legacy Systems

Replacing a legacy student information system is complex and expensive. The strategies described here offer the potential to eliminate many of the weaknesses of the legacy SIS at a fraction of the cost of system replacement.

[Image: cracked and bandaged hourglass on a red background. Credit: Francesco Scatena / iStock / Getty Images Plus © 2019]

Nearly every higher education institution depends on a core administrative student information system (SIS). Because the SIS sits at the center of so many of the day-to-day operations of managing students, courses, and grades, it becomes extremely important, expensive to operate, and hard to change. Every software system has limitations, of course, and system administrators soon find they need to either modify the way they work or modify the SIS. Because changing the way people work requires changing human behavior, changing the software is often simpler and more expedient.

Over time, these changes accumulate. Eventually, the resulting complicated and deeply embedded system can no longer support modern interfaces and new ways of doing business. At some point, campus leaders find themselves investing in the complex, risky, expensive, and politically fraught process of replacing their SIS in the hope of providing better service to students, improved access to data, and a more flexible technology environment for the future.

It's too soon to know whether the emerging generation of Software-as-a-Service (SaaS) systems will meet the promise of better service and longer-lasting technology. However, the decision to replace an SIS generates unavoidable costs (primarily financial, but also political), especially at a larger university or university system. Every higher education leader understands that financial and human resources will always be limited and that the money and energy needed to replace an SIS leave fewer resources to invest elsewhere. But eventually legacy software no longer matches campus need, and the drumbeat of "replace the SIS" becomes too loud to ignore.

The California State University (CSU) operates a complex and expensive SIS environment, with a single software system currently servicing 22 campuses (23 by the end of 2021). While these campuses all use one product (PeopleSoft from Oracle), the code base for the system is modified first by the central system office and then by the various campuses, representing 15 years of accumulated changes in some cases. In this environment, there's a natural appeal to the idea of tossing it all out and starting over. However, after a careful review, technology, administrative, and academic leaders decided in 2018 to postpone consideration of a new SIS, partly because they felt that the current generation of SaaS systems was not mature enough to meet institutional needs.

In the meantime, the CSU needed to address the demands for digital transformation and improved student outcomes. Was there a way to rethink the existing system while also preparing for the time when the old technology will no longer be tenable? Although there are some special circumstances at the CSU, our overall strategy and tactics can be emulated elsewhere, including at many smaller institutions. In fact, if these strategies can work at the CSU, they can work almost anywhere else.

Let's start by considering the ways in which legacy systems are found to be inadequate or suboptimal. Often, the legacy SIS does a decent job of processing transactions; the CSU's PeopleSoft environment handles millions of transactions each year with few technical problems. But common frustrations include the following six issues:

  1. Integration: Legacy systems, designed to work as monoliths, are difficult and costly to integrate with modern SaaS solutions.
  2. UX: Dated user experience (UX) design doesn't meet students' expectations and hinders staff and faculty.
  3. Data: Complex relational databases make it difficult and slow to obtain desired data.
  4. Scalability: Older SISs were not designed to scale dynamically using virtual or cloud resources.
  5. Outdated Models: Assumptions about academic structures—for example, traditional design of terms and degrees—may not match the needs of a modern institution.
  6. Product End-of-Life: Vendors eventually stop providing software updates, creating security and operational risks.

The CSU has embarked on strategies to address most of these issues. In this article, I will explore how the development of an API-based integration layer can significantly mitigate the first issue—the challenge of integrating the SIS with other systems—as well as create new options for updating the UX. In addition, I will present the strategies that the CSU has adopted to mitigate the challenges of data access and scalability.

Integration

Integration refers to the connections between the SIS and other campus systems, including learning management systems, human resource systems, library services, parking and dining systems, and system directories such as Microsoft Active Directory. Traditionally, these integrations have been built in one of two ways: through the export and import of "flat files," consisting of lines of data separated by a delimiter such as a comma, or through the modification of the SIS code to communicate directly with the outside system.

The flat file, or "batch," approach to integration is relatively easy to implement, although over time, managing these files can incur significant overhead and add new security and reliability risks. In addition, feeds typically operate on a daily schedule, meaning that a change in one system won't reach a downstream system until the next day, resulting in poor student service. For example, a student might return a library book on Tuesday afternoon to clear a hold, but the hold might not be lifted until the information reaches the SIS on Wednesday morning. Adding more frequent updates may be an option but may also be difficult or costly. Student services operating on a 24- or 48-hour schedule don't meet the expectations of today's students or their families.
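The delay built into a nightly feed can be made concrete with a minimal sketch. The file layout, the column names, and the 2:00 a.m. run time are all assumptions chosen for illustration; they do not describe any actual CSU feed.

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical flat-file export from a library system: one cleared hold per line.
LIBRARY_EXPORT = """student_id,hold_code,cleared_at
1001,LIB_OVERDUE,2019-07-02T15:30:00
1002,LIB_FINE,2019-07-02T09:05:00
"""


def next_batch_run(after: datetime, run_hour: int = 2) -> datetime:
    """Return the first nightly run (assumed to start at run_hour) strictly after `after`."""
    run = after.replace(hour=run_hour, minute=0, second=0, microsecond=0)
    if run <= after:
        run += timedelta(days=1)
    return run


def hold_release_delays(export_text: str) -> dict:
    """Map each student ID to how long a cleared hold waits for the next import."""
    delays = {}
    for row in csv.DictReader(io.StringIO(export_text)):
        cleared = datetime.fromisoformat(row["cleared_at"])
        delays[row["student_id"]] = next_batch_run(cleared) - cleared
    return delays


delays = hold_release_delays(LIBRARY_EXPORT)
for student_id, delay in delays.items():
    print(student_id, "waits", delay)
```

In this sketch, a hold cleared on Tuesday afternoon is not seen by the SIS until the early hours of Wednesday, which is the gap described above.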

Modifying the underlying code of the SIS may address the need for timeliness. But it is far more costly in both the short and the long term. Modifications to an SIS need to be carefully tested and sometimes updated every time the SIS code is updated by the vendor. As the number of modifications grows over the years, the update process becomes progressively more complex, accumulating technical debt along the way.

An alternative strategy is to create an intermediate data repository, which can consist of a simple set of files, often called a "web view," or a more complex database that may do double-duty as a data warehouse. While this strategy can provide a partial solution, it falls short of current requirements in two ways. First, these systems typically are updated only periodically, so the information is almost always stale and thus cannot respond to users' needs in real time. Second, a data repository is a one-way strategy—systems can access SIS data, but they can't update it.

Modern software is built using a different model, in which individual programs can communicate in real time, or with a small delay ("near real time"), using a well-defined set of interfaces called an Application Programming Interface (API). The most commonly used approach to APIs, called RESTful, provides a simple model of application interaction consisting mostly of GETs and PUTs. For example, to find out whether a library book with a particular identifier is currently checked out, a user might send a GET Library-Book-Status (ID) message to the library management system and receive a message "On Loan," indicating that the book is checked out. By composing GET and PUT messages, two systems can communicate without requiring flat files or code modification.
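The library example can be sketched in a few lines. This is a toy, in-memory stand-in for a REST-style interface with no network involved; the resource path, the book ID, and the class name are hypothetical and do not correspond to any real library system's API.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int  # HTTP-style status code
    body: dict


class LibraryApi:
    """Toy stand-in for a library system's REST-style interface."""

    def __init__(self):
        # Resource state keyed by book ID; the ID and status are made up.
        self._books = {"B-42": {"status": "On Loan"}}

    def get(self, path: str) -> Response:
        """GET /books/<id>/status reads state without changing it."""
        book = self._books.get(self._book_id(path))
        if book is None:
            return Response(404, {"error": "unknown book"})
        return Response(200, dict(book))

    def put(self, path: str, body: dict) -> Response:
        """PUT /books/<id>/status replaces the stored state with `body`."""
        book_id = self._book_id(path)
        if book_id not in self._books:
            return Response(404, {"error": "unknown book"})
        self._books[book_id] = dict(body)
        return Response(200, dict(body))

    @staticmethod
    def _book_id(path: str) -> str:
        # Expect exactly "/books/<id>/status"; anything else maps to no book.
        parts = path.strip("/").split("/")
        if len(parts) == 3 and parts[0] == "books" and parts[2] == "status":
            return parts[1]
        return ""


api = LibraryApi()
before = api.get("/books/B-42/status")
api.put("/books/B-42/status", {"status": "Available"})
after = api.get("/books/B-42/status")
print(before.body["status"], "->", after.body["status"])
```

The caller needs to know only the resource path and the two verbs; nothing about how the library system stores its data leaks through the interface.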

What if we could do the same with the legacy SIS? Some colleges and universities have done so, implementing software systems that can translate the data and business functions of the SIS into these RESTful APIs. Brigham Young University has made notable strides in the area, for example.[1] And a number of other institutions have also created APIs. A list maintained by Kin Lane identifies nearly 50 colleges and universities around the world with API initiatives, many including SIS functionality.[2]

While individual institutions have developed their own APIs, a standard API model would simplify integration with third-party software. IMS Global Learning Consortium has initiated the development of EDU-API, intended to provide a standard API for accessing higher education SISs, not just in the United States but internationally. If the effort succeeds, the payoff in reduced interface costs, for institutions and vendors alike, would be significant.

There are two other advantages of an API-based integration strategy. First, implementing integrations via API can be accomplished much more quickly than traditional development. Not only are well-designed APIs simple and easy to use, they also isolate the add-on development from the SIS, reducing the risk that an integration will cause a performance problem or will damage the integrity of the SIS. Second, API-based programs can give veteran software developers the opportunity to work with modern tools, and by eliminating the large amounts of specific knowledge needed to work in an SIS, they multiply the number of individuals who can develop campus software. This has the advantage of making the IT organization less of a bottleneck and encouraging a wide range of innovation in applications and interfaces.

The CSU is in the early phase of developing an API that will simplify integration with our PeopleSoft system across 23 campuses by 2021, eventually extending beyond IT departments to include staff, faculty, and even students who want to develop innovative interfaces. By developing an SIS API, we will greatly simplify SIS interaction and abstract it from the underlying software; eventually, when most or all interfaces are via API, we will be able to change the underlying SIS while minimizing the impact on these integrations, thus reducing the cost of a future SIS replacement.
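The idea of abstracting integrations from the underlying software can be sketched as follows. Every class, method, and field name here is hypothetical; the sketch illustrates the general pattern of an API layer over interchangeable backends and is not a description of the CSU's actual design.

```python
from typing import Protocol


class SisBackend(Protocol):
    """What the API layer requires of any underlying SIS (hypothetical contract)."""

    def enrolled_units(self, student_id: str) -> float: ...


class LegacySisBackend:
    """Hypothetical adapter over the legacy system's row-oriented records."""

    def __init__(self, rows):
        self._rows = rows  # (student_id, course, units) tuples

    def enrolled_units(self, student_id: str) -> float:
        return float(sum(units for sid, _, units in self._rows if sid == student_id))


class FutureSaasBackend:
    """Hypothetical adapter over a replacement SIS with a different data shape."""

    def __init__(self, enrollments):
        self._enrollments = enrollments  # {student_id: [{"course": ..., "units": ...}]}

    def enrolled_units(self, student_id: str) -> float:
        return float(sum(e["units"] for e in self._enrollments.get(student_id, [])))


class SisApi:
    """The stable interface that integrations call; the backend can be swapped."""

    def __init__(self, backend: SisBackend):
        self._backend = backend

    def get_enrollment_summary(self, student_id: str) -> dict:
        return {"student_id": student_id, "units": self._backend.enrolled_units(student_id)}


legacy_api = SisApi(LegacySisBackend([
    ("1001", "MATH101", 3), ("1001", "HIST200", 4), ("1002", "MATH101", 3),
]))
future_api = SisApi(FutureSaasBackend({
    "1001": [{"course": "MATH101", "units": 3}, {"course": "HIST200", "units": 4}],
}))
print(legacy_api.get_enrollment_summary("1001"))
print(future_api.get_enrollment_summary("1001"))
```

An integration written against SisApi sees the same response from either backend, which is why a later SIS replacement would touch the adapters rather than every integration.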

An important positive impact that we hope to realize via our integration strategy is the modernization of our approach to software development. Traditional, legacy software development, while it has been successful in many ways, is slow and inflexible, requires a large investment of time and effort, and often cannot accommodate the expected rate of change in a modern institution. Developing more agile approaches may be possible, but this is a difficult transition with the tools available for a legacy system.

Our integration layer will give us the opportunity to train our experienced software staff to use the tools of modern software development. By isolating the core of the system and innovating around the edges, our developers will be freed up to use agile methods and "fail fast," a concept that really doesn't apply when you're modifying the core engine that drives the business processes of the entire institution. In addition, we create opportunities for a new generation of developers, already accustomed to using APIs, to bring their innovative approaches into our environment. We can even benefit from the efforts of the many students already familiar with working with modern tools to create mobile and web applications, and we can crowdsource new development where appropriate. By the time we are ready to replace the legacy system, we will already be far down the path toward the methods and patterns used in a SaaS-based environment.

We are also exploring the possibility that with the new flexibility offered by the integration layer, we can begin to "unwind" some of the custom development that has been added over the years, potentially reducing our technical debt in a dramatic way. This may mean that we can merge divergent development paths taken over time by multiple campuses and share a single code base among these campuses, further reducing downstream costs. The goal is to use the legacy system for what it does well—and for nothing else.

UX

Poor and outdated user interfaces represent another motivator for replacing the SIS. Based on legacy models of a text-based computer terminal, SISs have been updated to operate on the web and are just now beginning to provide adequate interfaces on mobile devices. Because of the computer terminal (aka "green screen") legacy, users often have to navigate through a long chain of nonintuitive "screens" and complex menus, clicking from one place to another to complete even a simple transaction. Rarely does the SIS modify its view based on an understanding of user context; instead it presents links and menus for options that may have no use and may not even be available to a particular user. Students (and others) find these screens inscrutable, slow, and hard to use. The idea that a student should have to be trained in the use of the campus system in order to register for a class is both impractical, from the campus point of view, and incomprehensible, from the perspective of a student who has grown up on the internet and smartphones.

While newer SISs generally offer a much better user experience, they still typically suffer from their design as a "silo" within the campus infrastructure. Students will register for classes perhaps 15 or 20 times over their 4 or 5 years. Why would they want to "go to" a special system for this purpose rather than having the registration process integrated with other campus functions they may access more frequently? Many campuses have developed one or more campus apps, typically designed in a "mobile-first" paradigm; with this approach, it makes more sense to integrate key SIS functionality into the mobile app or portal. And the most natural way to create such an integration is through an API.

Hence, creating a comprehensive SIS API supports not only integration in general, but integration in the development of an improved student (and faculty and staff) user experience. Arguably, allowing access to the SIS via an API is more flexible and "future proof" than choosing a new SIS, which will have embedded within it a model of user interaction that may or may not be consistent with the overall user experience a campus is creating for its constituents. Thus the power of the SIS API is that it supports both internal integrations and improved user experience. Note that interaction requires retrieving information from, as well as sending information to, the SIS, so that APIs need to include both GET and PUT capability, and they generally need to be real-time or near real-time.

Data

Users of the traditional SIS have often been frustrated by the difficulty they encounter in accessing data for reporting, analytical, and predictive purposes. Campus constituents generally have the sense that an SIS contains a wealth of data but that they can't get to it. SIS relational table structures are highly complex and difficult to understand, and poorly designed queries can cause a significant performance hit on an SIS, potentially impacting end users.

Many campuses have addressed this limitation by creating data warehouses, an approach that has led to some successes, to many more partial successes, and sometimes to outright failures. Even a well-designed and implemented data warehouse is expensive to build and operate and tends to lack the flexibility needed over time to answer new questions that were not anticipated when the original data warehouse was designed.

As Vince Kellen explains,[3] traditional strategies for data warehouses were based on a model of scarcity: scarcity in computational horsepower to perform Extract, Transform, and Load (ETL); scarcity in disk storage; and scarcity in processing to organize and analyze the data. Data warehousing tools force the enterprise to anticipate in advance the kinds of questions that end users are likely to ask, which is unsatisfying in a couple of ways. First, understanding what those questions are likely to be, and then building the models for the data warehouse, is a slow and expensive process. Second, when an unanticipated question arises that can't be answered by the existing data warehouse, another slow and expensive process is typically required to modify the environment for the new type of query. Because of the inflexibility and costs of the data warehouse model, some campuses despair of ever having the timely access they need to develop a culture of data-driven decision-making.

At the CSU we're developing a newer and more flexible approach, similar to what Kellen proposes. A data lake in the Amazon Web Services (AWS) cloud keeps a complete copy of each day's full set of updates, going back an entire year. This comprehensive data store can be used with virtual query tools, such as Amazon Redshift and others, to allow users to construct queries for any question that the data can potentially answer, without requiring data warehouse designers to know in advance what these questions will be. Modern public cloud tools enable this approach at a large scale and at a manageable cost. At the CSU, we're optimistic that this new approach of data abundance, rather than data scarcity, will help reduce the frustration over accessing the data in our legacy system.
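The difference from a pre-modeled warehouse can be illustrated with a small sketch in which raw daily update records are kept as-is and a question is posed only at query time. The record layout, table names, and dates are invented for the example, and a plain Python dictionary stands in for cloud storage; none of this reflects the CSU's actual data lake.

```python
import json
from collections import Counter

# Hypothetical lake: one list of raw JSON update records per day, stored unmodeled.
RAW_UPDATES = {
    "2019-07-01": [
        '{"table": "ENROLLMENT", "student_id": "1001", "action": "ADD", "course": "MATH101"}',
        '{"table": "ENROLLMENT", "student_id": "1002", "action": "ADD", "course": "MATH101"}',
        '{"table": "HOLD", "student_id": "1001", "action": "SET"}',
    ],
    "2019-07-02": [
        '{"table": "ENROLLMENT", "student_id": "1002", "action": "DROP", "course": "MATH101"}',
        '{"table": "ENROLLMENT", "student_id": "1003", "action": "ADD", "course": "HIST200"}',
    ],
}


def scan(lake: dict, start: str, end: str):
    """Yield (day, record) for every raw update between two ISO dates, inclusive."""
    for day in sorted(lake):
        if start <= day <= end:
            for line in lake[day]:
                yield day, json.loads(line)


def net_enrollment_change(lake: dict, start: str, end: str) -> Counter:
    """An ad hoc question decided at query time: net adds minus drops per course."""
    change = Counter()
    for _, record in scan(lake, start, end):
        if record.get("table") == "ENROLLMENT":
            change[record["course"]] += {"ADD": 1, "DROP": -1}.get(record["action"], 0)
    return change


result = net_enrollment_change(RAW_UPDATES, "2019-07-01", "2019-07-02")
print(dict(result))
```

Answering a different question later means writing a different function over the same raw records, not redesigning the store.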

Scalability

Scalability refers to the ability to adjust the resources of a system dynamically based on demand. This is one of the cardinal features of cloud computing, and it fits the usage of an SIS perfectly. The sizing of an SIS is typically based on peak demand, which may be the first day of classes or the last day to make schedule changes. If an institution's system contains enough resources to manage all the peaks, then most of those resources sit idle during other times. Cloud computing offers a pay-as-you-go model, potentially offering lower costs and better performance under peak loads.
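A rough calculation shows why peak-based sizing leaves capacity idle. The demand profile and per-server capacity below are invented for illustration, and the comparison is in server-hours rather than dollars, since cloud and on-premises hourly prices differ.

```python
import math


def servers_needed(requests_per_min: int, capacity_per_server: int) -> int:
    """Smallest whole number of servers that can absorb the load (at least one)."""
    return max(1, math.ceil(requests_per_min / capacity_per_server))


# Hypothetical registration day: 20 quiet hours followed by a 4-hour peak.
hourly_demand = [200] * 20 + [5000] * 4  # requests per minute, one entry per hour
CAPACITY_PER_SERVER = 500                # requests per minute one server can handle

hourly_servers = [servers_needed(d, CAPACITY_PER_SERVER) for d in hourly_demand]

# Sized for the peak: the peak server count runs for all 24 hours.
fixed_server_hours = max(hourly_servers) * len(hourly_servers)
# Scaled dynamically: only the servers each hour actually requires.
elastic_server_hours = sum(hourly_servers)
idle_share = 1 - elastic_server_hours / fixed_server_hours

print(fixed_server_hours, elastic_server_hours, f"{idle_share:.0%} idle")
```

With these made-up numbers, a system sized for the peak uses four times the server-hours of one that scales with demand. Whether that becomes a cost saving depends on the price per server-hour in each model.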

Because legacy systems were designed before the cloud was developed, they typically were not architected to take advantage of the affordances of the cloud. Simply moving a legacy system to cloud hosting may serve some purposes, but doing so will often cost more than a traditional model and won't necessarily provide dynamic scaling. The CSU has chosen a hybrid approach; by moving from a traditional, hosted data center model to a cloud hybrid data center, we have begun to enable certain types of scaling in our environment.

First, the three-tier architecture model used by most legacy systems contains elements that can take advantage of the cloud without major modification. While the database layer typically operates in a static environment, web servers provide the front end, and these can become cloud-based. In our environment, we are developing a hybrid in which those elements that lend themselves to cloud containers can be scaled up and down dynamically as demand shifts.

In addition, we are adding the dynamic provisioning of test-and-development instances in the cloud. This feature is supported by database virtualization (see, e.g., Delphix), which enables a small portion of the core database to be replicated in the cloud while most of the data "stays home" in the hosted environment. This will accelerate system development and testing by allowing for new development and test instances "on demand" in the cloud without requiring the resource management needed in the past.

In Conclusion

The last two issues noted above—Outdated Models and Product End-of-Life—are somewhat less amenable to the strategies described above. To the extent that higher education moves away from traditional models of terms, courses, and credits, the assumptions embedded in a 1990s model of higher education will become progressively obsolete and will not fit the evolving business model of our institutions. However, based on current trends, we don't see this change coming to our institutions in a substantial way in the near term. Product end-of-life could make the existing system too expensive and risky to maintain, so it's a real threat; but at this time we expect to have close to another decade before we hit this wall.

For many higher education institutions, replacing an SIS may become inevitable in the long run. In the meantime, using strategies to extend its life while eliminating many of its weaknesses can represent good stewardship of institutional resources.

Notes

  1. David Raths, "Building APIs for the University and the Student," Campus Technology, April 13, 2017.
  2. Kin Lane, "Why Am I Looking at Universities and APIs," workshop [n.d.], API Evangelist, accessed July 1, 2019.
  3. Vince Kellen, "21st-Century Analytics: New Technologies and New Rules," EDUCAUSE Review 54, no. 2 (Spring 2019).

Michael Berman is Chief Innovation Officer and Deputy CIO at California State University, Chancellor's Office. He is the 2019 Editor of the New Horizons column for EDUCAUSE Review.

© 2019 Michael Berman. The text of this article is licensed under the Creative Commons Attribution 4.0 International License.

EDUCAUSE Review 54, no. 3 (Summer 2019)