Sunday, August 07, 2022

Todd Montgomery: Unblocked by design


Read on InfoQ, which publishes a presentation given at QCon Plus in November 2021. A point of view far from how I have always worked, but with arguments worth attending to. Todd Montgomery argues in favor of asynchronous process design, considering first of all that sequentiality is an illusion:

All of our systems provide this illusion of sequentiality, this program order of operation that we really hang our hat on as developers. We look at this and we can simplify our lives by this illusion, but be prepared, it is an illusion. That's because a compiler can reorder, runtimes can reorder, CPUs can reorder. Everything is happening in parallel, not just concurrently, but in parallel on all different parts of a system, operating systems as well as other things. It may not be the fastest way to just do step one, step two, step three. It may be faster to do steps one and two at the same time or to do step two before one because of other things that can be optimized. By imposing order on that we can make some assumptions about the state of things as we move along. Ordering has to be imposed. This is done by things in the CPU such as the load/store buffers, providing you with this ability to go ahead and store things to memory, or to load them asynchronously. Our CPUs are all asynchronous.
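As an illustration of my own (not from the talk), the following minimal Java sketch shows the illusion at work: with no synchronization, the compiler, the JIT, or the CPU's store/load buffers may reorder the stores and loads so that both threads read the other's variable as still zero. The class and field names are hypothetical, and the reordering may take many iterations to show up, or never appear at all, depending on the hardware and the JIT.

    // Minimal sketch: program order says each thread stores 1 before loading the
    // other variable, yet (r1 == 0 && r2 == 0) is a legal outcome because stores
    // and loads can be reordered when there is no synchronization.
    public class ReorderingDemo {
        static int x, y, r1, r2;

        public static void main(String[] args) throws InterruptedException {
            for (int i = 0; i < 1_000_000; i++) {
                x = 0; y = 0; r1 = 0; r2 = 0;
                Thread a = new Thread(() -> { x = 1; r1 = y; });
                Thread b = new Thread(() -> { y = 1; r2 = x; });
                a.start(); b.start();
                a.join(); b.join();
                if (r1 == 0 && r2 == 0) {   // neither thread observed the other's store
                    System.out.println("reordering observed at iteration " + i);
                    break;
                }
            }
        }
    }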

Storages are exactly the same way, different levels of caching give us this ability for multiple things to be optimized along that path. OSs with virtual memory and caches do the same thing. Even our libraries do this with the ideas of promises and futures. The key is to wait. All of this provides us with this illusion that it's ok to wait. It can be, but that can also have a price, because the operating system can de-schedule. When you're waiting for something, and you're not doing any other work, the operating system is going to take your time slice. It's also lost opportunity to do work that is not reliant on what you're waiting for. In some application, that's perfectly fine, in others it's not. By having locks and signaling in that path, they do not come for free, they do impose some constraints.

Setting the context first:

When we talk about sequential or synchronous or blocking, we're talking about the idea that you do some operation. You cannot continue to do things until something has finished or things like that. This is more exaggerated when you go across an asynchronous binary boundary. It could be a network. It could be sending data from one thread to another thread, or a number of different things. A lot of these things make it more obvious, as opposed to asynchronous or non-blocking types of designs where you do something and then you go off and do something else. Then you come back and can process the result or the response, or something like that.

How he sees synchrony:

I'll just use as an example throughout this, because it's easy to talk about, the idea of a request and a response. With sync or synchronous, you would send a request, there'll be some processing of it. Optionally, you might have a response. Even if the response is simply just to acknowledge that it has completed. It doesn't always have to involve having a response, but there might be some blocking operation that happens until it is completed. A normal function call is normally like this. If it's sequential operation, and there's not really anything else to do at that time, that's perfectly fine. If there are other things that need to be done now, or it needs to be done on something else, that's a lost opportunity.
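A minimal sketch of that synchronous shape, with names of my own (SyncClient, Request and Response are illustrations, not from the talk): the caller blocks inside send until processing completes, and any independent work that could have run in the meantime is the lost opportunity he mentions.

    // Synchronous request/response: the caller can do nothing until send returns.
    interface SyncClient {
        Response send(Request request);   // blocks until processing has completed
    }

    final class Caller {
        void doWork(SyncClient client) {
            Response response = client.send(new Request(0));
            // Only now can anything else happen; if independent work was
            // available, the wait above was a lost opportunity.
            handle(response);
        }
        void handle(Response response) { /* process the result */ }
    }

    record Request(long correlationId) {}
    record Response(long correlationId) {}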

And asynchrony:

Async is more about the idea of initiating an operation, having some processing of it, and you're waiting then for a response. This could be across threads, cores, nodes, storage, all kinds of different things where there is this opportunity to do things while you're waiting for the next step, or that to complete or something like that. The idea of async is really, what do you do while waiting? It's a very big part of this. Just as an aside, when we talk about event driven, we're talking about actually the idea of on the processing side, you will see a request come in. We'll denote that as OnRequest. On the requesting side, when a response comes in, you would have OnResponse, or OnComplete, or something like that. We'll use these terms a couple times throughout this.
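The asynchronous counterpart, again as a sketch under my own naming but using the OnResponse/OnComplete vocabulary from the talk: send returns immediately, the caller goes off and does other work, and the handler is invoked whenever the response or the completion arrives.

    // Asynchronous request/response: initiate, do other work, react to events.
    interface AsyncClient {
        void send(Request request, ResponseHandler handler);   // returns immediately
    }

    interface ResponseHandler {
        void onResponse(Response response);     // a response has arrived
        void onComplete(long correlationId);    // the request has finished
    }

    final class AsyncCaller implements ResponseHandler {
        void doWork(AsyncClient client) {
            client.send(new Request(0), this);
            // ... do other, independent work here instead of blocking ...
        }
        @Override public void onResponse(Response response) { /* match and process */ }
        @Override public void onComplete(long correlationId) { /* request finished */ }
    }

    record Request(long correlationId) {}
    record Response(long correlationId) {}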

Montgomery's aim is to process asynchronously and take advantage of the dead time:

The key here is while something is processing or you're waiting, is to do something, and that's one of the takeaways I want you to think of. It's a lost opportunity. What can you do while waiting and make that more efficient? The short answer is, while waiting, do other work. Having the ability to actually do other stuff is great. The first thing is sending more requests, as we saw. The sequence here is, how do you distinguish between the requests? The relationship here is you have to correlate them. You have to be able to basically identify each individual request and individual response. That correlation gives rise to having things which are a little bit more interesting. The ordering of them starts to become very relevant. You need to figure out things like how to handle things that are not in order. You can reorder them. You're just really looking at the relationship between a request and a response and matching them up. It can be reordered in any way you want, to make things simple. It does provide an interesting question of, what happens if you get something that you can't make sense of. Is it invalid? Do you drop it? Do you ignore it? In this case, you've sent request 0, and you've got a response for 1. In this point, you're not sure exactly what the response for 1 is. That's handling the unexpected.
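A minimal sketch, mine rather than Montgomery's, of the correlation bookkeeping that paragraph describes: each request gets an id, responses are matched against the in-flight map in whatever order they arrive, and a response that cannot be matched is the "unexpected" case to be handled by policy (drop it, log it, or treat it as an error).

    import java.util.HashMap;
    import java.util.Map;

    // Correlate out-of-order responses to the requests that produced them.
    final class Correlator {
        private final Map<Long, Request> inFlight = new HashMap<>();
        private long nextId = 0;

        long sendVia(AsyncClientLike client) {
            long id = nextId++;
            inFlight.put(id, new Request(id));
            client.send(id);
            return id;
        }

        void onResponse(long correlationId) {
            Request matched = inFlight.remove(correlationId);
            if (matched == null) {
                // Handling the unexpected: a response we cannot make sense of.
                System.out.println("unexpected response " + correlationId);
                return;
            }
            // Process the matched request/response pair, in any order we like.
        }

        record Request(long correlationId) {}
        interface AsyncClientLike { void send(long correlationId); }
    }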

(...) This is an async duty cycle. This looks like a lot of the duty cycles that I have written, and I've seen written and helped write, which is, you're basically sitting in a loop while you're running. You usually have some mechanism to terminate it. You usually poll inputs. By polling, I definitely mean going to see if there's anything to do, and if not, you simply return and go to the next step. You poll if there's input. You check timeouts. You process pending actions. The more complicated work is less in the polling of the inputs and handling them, it's more in the checking for timeouts, processing pending actions, those types of things. Those are a little bit more complex. Then at the end, you might idle waiting for something to do. Or you might just say, ok, I'm going to sleep for a millisecond, and you come right back. You do have a little bit of flexibility here in terms of idling, waiting for something to do.
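The duty cycle he describes can be sketched roughly like this (the structure follows the quote; the method names, the stub bodies and the one-millisecond idle are my own choices):

    // An async duty cycle: poll inputs, check timeouts, process pending actions,
    // and idle only when there was nothing to do.
    final class DutyCycle implements Runnable {
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                int workDone = 0;
                workDone += pollInputs();             // see if there is anything to read; never block
                workDone += checkTimeouts();          // expire requests that have waited too long
                workDone += processPendingActions();  // the more complex, stateful work
                if (workDone == 0) {
                    idle();                           // nothing to do: back off briefly
                }
            }
        }

        void stop() { running = false; }              // the mechanism to terminate the loop

        // Placeholders standing in for real work; each returns how much it did.
        private int pollInputs() { return 0; }
        private int checkTimeouts() { return 0; }
        private int processPendingActions() { return 0; }

        private void idle() {
            try {
                Thread.sleep(1);                      // e.g. sleep a millisecond and come right back
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }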

Honestly, these concepts seem complicated to apply in an everyday line of work, and more viable when building operating-system-level software. Montgomery's interviewer (Printezis) sees it exactly that way: "You did talk about the duty cycle and how you would write it. In reality, how much a developer would actually write that, but instead use a framework that will do most of the work for them?"

Montgomery's reply:

(...) Beyond that, I mean, patterns and antipatterns, I think, learning queuing theory, which may sound intimidating, but it's not. Most of it is fairly easy to absorb at a high enough level that you can see far enough to help systems. It is one of those things that I think pays for itself. Just like learning basic data structures, we should teach a little bit more about queuing theory and things behind it. Getting an intuition for how queues work and some of the theory behind them goes a huge way, when looking at real life systems. At least it has for me, but I do encourage people to look at that. Beyond that, technologies frameworks, I think by spending your time more looking at what is behind a framework. In other words, the concepts, you do much better than just looking at how to use a framework. That may be front and center, because that's what you want to do, but go deeper. Go deeper into, what is it built on? Why does it work this way? Why doesn't it work this other way? Asking those questions, I think you'll learn a tremendous amount. (...)
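As an aside of my own (not part of the interview), the most basic piece of that queuing intuition is Little's Law: in a stable system the average number of items in the system, L, equals the average arrival rate λ times the average time each item spends in the system, W, that is, L = λW. For example, a service receiving 200 requests per second with an average response time of 50 ms holds on average 200 × 0.05 = 10 requests in flight; if the response time doubles, so does the backlog.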

The conversation goes on and drifts into other related topics. Worth reading and rereading. I will have to come back to it more than once.

I see here a way of approaching processes far from the way I have usually worked, but I must admit that over the last five or six years conceptual changes have been overflowing, and I would say we are in a fifth or sixth generation, far from what we called the fourth generation twenty or thirty years ago. Time will show what has proven durable and what has headed down a dead end. I am willing to listen.

 


Nightmares in the cloud

Forrest Brazeal, currently a Google Cloud employee (An AWS Hero turned Google Cloud employee, I explore the technical and philosophical differences between the two platforms. My biases are obvious, but opinions are my own), pointed out in July that any cloud developer's worst nightmare is a recursive call in their tests that escalates their account's bill from a few dollars/euros to "thousands" (50,000, for example). And a recursive call that generates thousands of processed invocations can happen in any test:

AWS calls it the recursive runaway problem. I call it the Hall of Infinite Functions - imagine a roomful of mirrors reflecting an endless row of Lambda invocations. It’s pretty much the only cloud billing scenario that gives me nightmares as a developer, for two reasons:

  • It can happen so fast. It’s the flash flood of cloud disasters. This is not like forgetting about a GPU instance and incurring a few dollars per hour in linearly increasing cost. You can go to bed with a $5 monthly bill and wake up with a $50,000 bill - all before your budget alerts have a chance to fire.

  • There’s no good way to protect against it. None of the cloud providers has built mechanisms to fully insulate developers from this risk yet.

Brazeal points to an incident described in detail by its own victims (We Burnt $72K testing Firebase + Cloud Run and almost went Bankrupt) that gives an idea of the problem. In that case the bill went from an expected 7 dollars to 72,000...
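A minimal, provider-agnostic sketch of my own (not from Brazeal's article) of the runaway shape: a handler whose output feeds the very trigger that invoked it, so each invocation schedules another one. The depth counter shown is only a partial, application-level guard; as the article stresses, no provider yet offers a built-in mechanism that fully prevents this.

    import java.util.Map;

    // A function that re-triggers itself through its own side effect, e.g. by
    // writing to the bucket, queue or topic it listens on.
    final class RunawayProneHandler {
        static final int MAX_DEPTH = 3;   // hypothetical cut-off chosen for the sketch

        // 'Trigger' stands in for whatever causes the next invocation.
        interface Trigger { void invoke(Map<String, String> payload); }

        void handle(Map<String, String> payload, Trigger trigger) {
            int depth = Integer.parseInt(payload.getOrDefault("depth", "0"));
            if (depth >= MAX_DEPTH) {
                // Without a check like this, the chain only stops when a quota,
                // an error, or the bill stops it.
                return;
            }
            // ... real work would go here ...
            trigger.invoke(Map.of("depth", Integer.toString(depth + 1)));
        }
    }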

Sudeep Chauhan, the protagonist of that incident, later wrote, after putting his house in order, a list of recommendations for working with a cloud services provider.

Note: Renato Losio, on InfoQ, picks up Brazeal's article and extends it, recalling another piece by Brazeal on the AWS free tier.


Saturday, August 06, 2022

You probably don't need microservices

Matthew Spence, on ITNEXT, going against the enormous wave of hype around microservices, develops a consistent set of arguments putting the importance and necessity of microservices into perspective (You don't need microservices). I will only highlight the argument about the simplicity of microservices and its supposed derived advantages:

"Simpler, Easier to Understand Code"

This benefit is at best disingenuous, at worst, a bald-faced lie.

Each service is simpler and easier to understand. Sure. The system as a whole is far more complex and harder to understand. You haven’t removed the complexity; you’ve increased it and then transplanted it somewhere else.

(...) Although microservices enforce modularization, there is no guarantee it is good modularization. Microservices can easily become a tightly coupled “distributed monolith” if the design isn’t fully considered.

(...) The choice between monolith and microservices is often presented as two mutually exclusive modes of thought. Old school vs. new school. Right or wrong. One or the other.

The truth is they are both valid approaches with different trade-offs. The correct choice is highly context-specific and must include a broad range of considerations.

The choice itself is a false dichotomy and, in certain circumstances, should be made on a feature-by-feature basis rather than a single approach for an entire organization’s engineering team.

Should you consider microservices?

As is often the case, it depends. You might genuinely benefit from a microservices architecture.

There are certainly situations where they can pay their dues, but if you are a small to medium-sized team or an early-stage project:

No, you probably don’t need microservices.