Cascading Failure Modes in Model-as-a-Service Architectures: When Your Dependencies Think
DOI:
https://doi.org/10.32628/IJSRCE237530
Keywords:
Model-as-a-Service, cascading failures, machine learning reliability, resilient architectures, circuit breakers, fallback strategies, graceful degradation, service orchestration, decision uncertainty, infrastructure risk, AI system dependability, intelligent dependencies
Abstract
As machine learning increasingly becomes a runtime dependency of contemporary software systems, Model-as-a-Service (MaaS) architectures are exposed to a new category of systemic risk in which a failure in one component propagates across system-level decision-making. Unlike failures in conventional microservices, failures in services that depend on machine learning can propagate silently through workflows, compounding uncertainty, degrading decision quality, and destabilizing downstream services. This study investigates cascading failure modes in MaaS architectures and argues that resilience must be designed explicitly around intelligent dependencies. Drawing on cascading failure theory, service-oriented architectures, and critical infrastructure systems, it conceptualizes how data inconsistencies, variability in model behavior, orchestration misalignment, and infrastructure disruptions combine to amplify failure. The article proposes a resilience-oriented architectural viewpoint built around circuit breakers, fallback strategies, and graceful degradation mechanisms tailored to machine-learning-reliant systems. Circuit breakers are re-conceptualized as safeguards against erratic inference behavior and delayed decisions; fallback strategies as goal-preserving mechanisms that trade abrupt service outages for a gradual decay in decision faithfulness; and graceful degradation as a controlled decline in decision faithfulness rather than a sudden service crash.
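To make the re-conceptualized circuit breaker concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a model callable that returns a (label, confidence) pair and a simpler rule-based fallback; the class name InferenceBreaker, the thresholds, and the "model"/"fallback" source tags are all hypothetical choices made for illustration. The breaker treats slow or low-confidence inference as a failure, and while it is open the caller receives a degraded but still goal-preserving answer instead of an outage.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class InferenceBreaker:
    """Illustrative circuit breaker that trips on slow or low-confidence inference."""
    max_latency_s: float = 0.5    # assumed latency budget per decision
    min_confidence: float = 0.6   # assumed confidence floor
    failure_threshold: int = 3    # consecutive bad calls before the breaker opens
    cooldown_s: float = 30.0      # how long the breaker stays open
    _failures: int = 0
    _opened_at: Optional[float] = None

    def call(
        self,
        model: Callable[[Any], Tuple[Any, float]],
        fallback: Callable[[Any], Any],
        x: Any,
        now: Callable[[], float] = time.monotonic,
    ) -> Tuple[Any, str]:
        # While open, bypass the model and serve the degraded answer.
        if self._opened_at is not None:
            if now() - self._opened_at < self.cooldown_s:
                return fallback(x), "fallback"
            # Cooldown elapsed: half-open, so a single bad call reopens the breaker.
            self._opened_at = None
            self._failures = self.failure_threshold - 1

        start = now()
        try:
            label, confidence = model(x)
            healthy = (
                now() - start <= self.max_latency_s
                and confidence >= self.min_confidence
            )
        except Exception:
            healthy = False  # a crashed model counts the same as a bad answer

        if healthy:
            self._failures = 0
            return label, "model"

        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = now()
        return fallback(x), "fallback"
```

Returning the source tag alongside the decision is one way to keep the drop in decision faithfulness visible to downstream services, so that a fallback answer is not silently mistaken for a model answer.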
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
https://creativecommons.org/licenses/by/4.0