SREcon25 Americas is a conference dedicated to technical advances and trends in Site Reliability Engineering (SRE).
Topics
- Complex Systems in Action
  - All things incidents, including but not limited to incident management and coordination, analyzing incidents and near misses, and preparing through activities such as game days, tabletop exercises, and drills
  - Unplanned capacity constraints, scale, and latency
  - How teams and organizations learn from the unexpected and use information from incidents and other unexpected events, including learnings or systemic changes implemented as a result of learning from others' outages
  - Experiences with chaos engineering and driving adoption across teams
- Systems Engineering/Principles
  - Databases (e.g., how data is stored on disk in MySQL, PostgreSQL performance, consistency models)
  - Performance (e.g., OS features, hardware design, bottlenecks)
  - Distributed Systems (e.g., consistency and consensus, Hadoop, MapReduce, Jupyter Notebooks, Containers)
  - Observability (e.g., monitoring overview, events vs. metrics, visualizations, debugging, scaling your monitoring infrastructure)
  - Firmware (e.g., Open Source Firmware, UEFI)
  - Network (e.g., SD-WAN, load balancing, DNS, IP protocols, layer 2 networks, BGP)
  - Security (e.g., TPMs, Hardware Security Modules, transport encryption, filesystem encryption, data management)
- Supporting the Humans in the Systems
  - Scaling work in a sustainable way, as well as topics around addressing and avoiding burnout
  - Career development, mentoring/internships, and supporting diversity in SRE
  - Creating psychological safety in teams and organizations, especially in high-pressure and high-stakes environments
Who Should Attend
Engineers involved or interested in:
- Systems engineering
- Site reliability engineering (SRE)
- Complex distributed systems