07 October 2025

20 Years of perfSONAR: How a Community-Built Toolkit Transformed Network Performance Monitoring for R&E

By Amber Rasche - Communications Manager, Internet2

Estimated reading time: 7 minutes

In two decades, perfSONAR has become a cornerstone of performance monitoring and troubleshooting for research and education (R&E) networks. The open-source toolkit plays an essential role in ensuring data flows smoothly and science moves forward.

Earlier this year, the perfSONAR development collaboration — ESnet, GÉANT, Indiana University, the University of Michigan, RNP, and Internet2 — announced the 20th anniversary of perfSONAR. To reflect more on the journey, we asked Mark Feit, measurement and telemetry systems architect at Internet2 and longtime perfSONAR contributor, to share insights on its major milestones, community impact, and what lies ahead.

This fall, the R&E community will have more opportunities to see perfSONAR in action and connect directly with the experts who develop it. perfSONAR will once again be part of SCinet at SC25 in St. Louis in November, continuing a tradition that began with its debut in 2005. The following month, the Internet2 Technology Exchange in Denver will include a full-day workshop and a technical talk on the latest developments and future direction of the project.

Mark Feit (center) uses perfSONAR to troubleshoot and resolve a network issue during SC.
Looking back over the past two decades, what do you see as the most significant milestones in perfSONAR’s evolution?


Mark: Three milestones stand out. First is perfSONAR coming into existence. The idea that you could ask systems across multiple organizations — often those you don’t control — to participate in a measurement was novel 20 years ago. It’s still rare outside of R&E.

Second is the establishment of the development collaboration. Having multiple organizations commit full-time, professional staff to the project gives it long-term support and a lot of accumulated institutional wisdom about how perfSONAR works and how best to apply it.

Third is the expansion of what perfSONAR could do that began in 2017 with the release of 4.0. The modular architecture of pScheduler has made it possible to perform more types of measurements (approximately two dozen) with more tools (over 40) and to send the results into other systems for storage or further processing in more ways (about a dozen). The API that came with it made perfSONAR into a measurement building block for other systems, making it more flexible and extensible. We’re now at version 5.2.2 and continue to reap the benefits of that shift.


How has perfSONAR shaped the way R&E networks think about and approach performance monitoring and troubleshooting?


Mark: In many ways, it goes the other way around: R&E’s approach has shaped perfSONAR. At my first Internet2 Technology Exchange in 2015, I heard a lot from our members about what they wanted. The development team went on to check off almost every box on that wishlist.

R&E networks differ from those in other sectors because they’re operated by multiple organizations alone and in concert. The open collection and sharing of performance data across networks is a side effect of everyone trying to achieve the singular goal of making them valuable tools for moving scientific data quickly and efficiently.

Back to your original question, perfSONAR has influenced R&E’s thinking in its by-the-community, for-the-community approach. Network operators have the freedom to deploy it widely without worrying about the cost beyond the hardware — and even that can come off the surplus pile. Our organizational structure has even been used as a template for other community-led projects.


How is the implementation of perfSONAR on Internet2’s network benefiting the community?


Mark: perfSONAR’s mantra has always been “the more, the merrier,” because more measurement points mean a finer-grained view of exactly where a problem is occurring. Historically, Internet2 had perfSONAR nodes at every point of presence, with many supporting internal visibility and operational needs and a few open to the community. With the launch of the Next Generation Infrastructure in 2021, open-to-the-community perfSONAR now exists everywhere on the Internet2 network, giving members end-to-end visibility from their devices to the nearest edge of the Internet2 backbone to the points handling their data as it travels.


Can you share a story or example where perfSONAR made a real difference?


Mark: One of my favorite examples happened during SC18, where I ran into a network engineer from a small university in New England who was struggling with dropped packets that disrupted measurements with other institutions. He suspected one specific device, but didn’t have a good way to prove it. 

We talked a bit about his test infrastructure and found two of his perfSONAR nodes that were well-placed to narrow it down. We ran some ad hoc tests from my laptop, which we perched atop the only surface we could find, a stray stool. The first ones replicated the problem, and we did some follow-ons to see where it stopped. It turned out that the suspect device was dropping otherwise-valid ICMP packets above a certain size but smaller than the MTUs on the interfaces. The device vendor confirmed the bug, issued a patch, and the problem was solved.
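
Editorial aside, not part of Mark's account: the "see where it stopped" step can be thought of as a search over packet sizes. The sketch below is one way to narrow down such a threshold, assuming drops depend only on size and that every size above the threshold fails. The `probe` callable and the simulated device are hypothetical stand-ins for a real measurement; they are not perfSONAR or pScheduler calls.

```python
from typing import Callable


def find_drop_threshold(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the smallest size in (low, high] at which the probe fails.

    Assumes probe(low) succeeds, probe(high) fails, and that failures are
    monotonic in size (once a size fails, every larger size fails too).
    """
    if not probe(low):
        raise ValueError("probe must succeed at the lower bound")
    if probe(high):
        raise ValueError("probe must fail at the upper bound")
    # Invariant: probe(low) succeeds and probe(high) fails.
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid
        else:
            high = mid
    return high


def simulated_device(size: int) -> bool:
    """Hypothetical device that silently drops payloads larger than 1200 bytes."""
    return size <= 1200


if __name__ == "__main__":
    # 64 and 1472 are illustrative bounds, not values from the story.
    print(find_drop_threshold(simulated_device, 64, 1472))
```

In practice `probe` would wrap whatever test is actually run between two well-placed measurement nodes; the search only decides which size to try next.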


How is perfSONAR preparing for the emerging challenges and opportunities facing the R&E community?


Mark: The development team has spent the last decade changing nearly every corner of perfSONAR from something with limited capabilities to something we can expand easily. The current architecture doesn’t do absolutely everything we or anyone else in the community has dreamed up, but it has handled nearly all of the curveballs thrown at it. That flexibility gives me a lot of confidence that, as R&E comes up with new things to measure or ways to report the results, we’ll be able to integrate them.


Looking to the next perfSONAR release, what new capabilities can the user community expect?


Mark: The theme for version 5.x, which has been underway for two years, has been overhauling storage and visualization. A lot of that software was developed in-house, which was the right thing to do at the time, but we’ve since been able to build on well-understood, open-source software like Logstash, OpenSearch, and Grafana. That puts a lot of power to analyze and display results in the hands of the community, who can do that on their own terms.

pSConfig, our program for centrally managing fleets of perfSONAR nodes, was overhauled in an earlier release and will continue to be part of the ecosystem. Its companion tool, the pSConfig Web Administrator, which allows building up mesh configurations without having to handwrite JSON, will see a significant upgrade in a future release. The new tool, called pSCompose, will interface directly with pScheduler, providing immediate access to new tests, tools, and archivers.


Imagine perfSONAR 10 years from now – what will it be doing that it doesn’t do today?


Mark: perfSONAR primarily resides at Layer 3 because that’s where science data lives. As boring as it may sound, I expect the emphasis to remain there for quite some time. perfSONAR’s architecture will allow integration with new tools to measure new protocols as they evolve.

What I think will be most interesting is less about new protocols and tools and more about unusual applications. For example, I’ve been thinking about ways to use perfSONAR as a unit-testing framework to verify that network security policies have been properly implemented, maybe even cooperatively between institutions. Doing this as part of a regular testing regime would shine a light on problems that might have remained hidden until an incident happened.
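
Editorial aside: the unit-testing idea can be pictured as a table of expectations checked against real measurements. The sketch below is purely illustrative and is not an existing perfSONAR feature. The `Expectation` type, the host names, and `fake_probe` are all hypothetical; in a real deployment the probe would be backed by an actual reachability test between nodes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Expectation:
    """One policy assertion: should source be able to reach dest on port?"""
    source: str
    dest: str
    port: int
    allowed: bool


# A probe reports whether a connection from source to dest:port succeeded.
Probe = Callable[[str, str, int], bool]


def find_violations(expectations: Iterable[Expectation], probe: Probe) -> List[Expectation]:
    """Return every expectation whose observed result differs from policy."""
    return [e for e in expectations if probe(e.source, e.dest, e.port) != e.allowed]


def fake_probe(source: str, dest: str, port: int) -> bool:
    """Hypothetical stand-in for a measurement: only port 443 gets through."""
    return port == 443


# Hypothetical policy shared between two institutions.
POLICY = [
    Expectation("campus-a", "dtn.campus-b", 443, True),
    Expectation("campus-a", "dtn.campus-b", 23, False),
    Expectation("campus-a", "dtn.campus-b", 22, True),
]

if __name__ == "__main__":
    for v in find_violations(POLICY, fake_probe):
        state = "blocked" if v.allowed else "open"
        print(f"{v.source} -> {v.dest}:{v.port} is unexpectedly {state}")
```

Running checks like this on a schedule is what would surface a misapplied rule before an incident does; here the port 22 entry is reported because policy says it should be reachable and the stand-in probe says it is not.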


Is there anything else happening with perfSONAR this year that the community should be aware of?


Mark: perfSONAR will once again be part of SCinet at SC25, Nov. 16-21 in St. Louis, its 20th appearance since the prototype debut in 2005. Like the rest of SCinet, the deployment there has become a testing ground for new ways to run it. Much of what we do there has pollinated what we do at Internet2 and what the project recommends to perfSONAR administrators.

The 2025 Technology Exchange, Dec. 8-12 in Denver, will offer a full-day perfSONAR workshop on Dec. 8. It will cover everything from installation to using pScheduler’s command-line interface to customizing Grafana dashboards. The development team will be giving a talk about the current and future state of the project in the Advanced Networking track on Dec. 10.

We have a few other things in the pipeline that aren’t ready to be shared yet, so keep your ears open.
