Sunday, December 11, 2022

Galactica and the difficulties of language models

In November, Meta introduced a language model named Galactica, built to assist scientific researchers, but only three days later it was withdrawn from public querying and testing. Essentially, as has happened in other fields of work with artificial intelligence (AI), the model does not distinguish truth from falsehood. In testing, texts formally presented as scientific but absurd, such as the existence of bears in space, or on the causes of the war in Ukraine, passed as genuine, complete with reasoned justifications.

Will Douglas Heaven, in Technology Review:

Galactica is a large language model for science, trained on 48 million examples of scientific articles, websites, textbooks, lecture notes, and encyclopedias. Meta promoted its model as a shortcut for researchers and students. In the company’s words, Galactica “can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.”

(...) A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood, a basic requirement for a language model designed to generate scientific text. People found that it made up fake papers (sometimes attributing them to real authors), and generated wiki articles about the history of bears in space as readily as ones about protein complexes and the speed of light. It’s easy to spot fiction when it involves space bears, but harder with a subject users may not know much about.

(...) Many scientists pushed back hard. Michael Black, director at the Max Planck Institute for Intelligent Systems in Germany, who works on deep learning, tweeted: “In all cases, it was wrong or biased but sounded right and authoritative. I think it’s dangerous.”

(...) The Meta team behind Galactica argues that language models are better than search engines. “We believe this will be the next interface for how humans access scientific knowledge,” the researchers write.  This is because language models can “potentially store, combine, and reason about” information. But that “potentially” is crucial. It’s a coded admission that language models cannot yet do all these things. And they may never be able to. “Language models are not really knowledgeable beyond their ability to capture patterns of strings of words and spit them out in a probabilistic manner,” says [Chirag Shah,  University of Washington]. “It gives a false sense of intelligence.”
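
Shah's description of "patterns of strings of words" emitted "in a probabilistic manner" can be made concrete with a toy sketch. The bigram model below (my illustration, vastly simpler than Galactica's transformer, but the same in spirit) has no notion of truth at all, only of which word tends to follow which:

    // Toy bigram "language model": it learns which word follows which,
    // then samples text. It has no concept of truth, only adjacency.
    function train(corpus: string): Map<string, string[]> {
      const words = corpus.split(/\s+/);
      const next = new Map<string, string[]>();
      for (let i = 0; i < words.length - 1; i++) {
        const followers = next.get(words[i]) ?? [];
        followers.push(words[i + 1]);
        next.set(words[i], followers);
      }
      return next;
    }

    function generate(model: Map<string, string[]>, start: string, length: number): string {
      const out = [start];
      let current = start;
      for (let i = 0; i < length; i++) {
        const followers = model.get(current);
        if (!followers) break;
        current = followers[Math.floor(Math.random() * followers.length)];
        out.push(current);
      }
      return out.join(" ");
    }

    // "bears live in space" is as fluent to the model as a true sentence.
    const model = train("bears live in the woods bears live in space stations orbit in space");
    console.log(generate(model, "bears", 6));

Scale that up by many orders of magnitude and add attention, and the fluency improves enormously; the indifference to truth does not.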

Grady Booch comments: "Galactica is little more than statistical nonsense at scale. Amusing. Dangerous. And IMHO unethical". Some ML researchers (Yann LeCun, in the same thread) are scandalized by the "unethical" label. I think some scientists have yet to take the measure of what they have in their hands.

Saturday, December 10, 2022

Frederick Brooks: a pioneer dies

A few days ago, on November 17, Frederick Brooks died: a pioneer of software engineering, almost of its first generation. Long-lived, he remained involved with digital technologies into the first decade of this century, having begun in 1953 after graduating from Duke University. He worked at IBM from 1956 to 1965, where he led the design of the System/360 computers, IBM's first family of compatible mainframes, the foundation of the architecture IBM built on, and the direct ancestor of the 4300 series and today's System z. Even now, an application coded on and for the 360 can run on a System z. Brooks was one of the pillars of the decisions that made this evolution possible.

Brooks's other great contribution lies in methodology, in the systematization of his experience from his IBM years: first, in 1975, with The Mythical Man-Month, and later, in 1986, with No Silver Bullet—Essence and Accident in Software Engineering, afterwards added as a new chapter of The Mythical Man-Month. There is a great leap between the eras in which he wrote these works and a present-day reading, but despite the technical gap, they should still be required reading.

What follows is Shane Hastie's obituary in InfoQ, with a good set of references to Brooks's achievements:

Dr Frederick P Brooks Jr, originator of the term architecture in computing, author of one of the first books to examine the nature of computer programming from a sociotechnical perspective, architect of the IBM 360 series of computers, university professor and person responsible for the 8-bit byte died on 17 November at his home in Chapel Hill, N.C. Dr Brooks was 91 years old.

He was a pioneer of computer architecture, highly influential through his practical work and publications including The Mythical Man Month, The Design of Design and his paper No Silver Bullet which debunked many of the myths of software engineering.

In 1999 he was awarded a Turing Award for landmark contributions to computer architecture, operating systems, and software engineering. In the award overview it is pointed out that

Brooks coined the term computer architecture to mean the structure and behavior of computer processors and associated devices, as separate from the details of any particular hardware implementation

In the No Silver Bullet article he states:

There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.

Quotations from The Mythical Man-Month: Essays on Software Engineering permeate software engineering today, including:

  • Adding manpower to a late software project makes it later.  
  • The bearing of a child takes nine months, no matter how many women are assigned.
  • All programmers are optimists.

On April 29, 2010 Dilbert explored the adding manpower quote.  

In 2010 he was interviewed by Wired magazine. When asked about his greatest technical achievement he responded

The most important single decision I ever made was to change the IBM 360 series from a 6-bit byte to an 8-bit byte, thereby enabling the use of lowercase letters. That change propagated everywhere.

He was the founder of the Computer Science Department at the University of North Carolina at Chapel Hill, where the Computer Science building is named after him. In an obituary the University says:

Dr. Brooks has left an unmistakable mark on the computer science department and on his profession; this is physically recognized by the south portion of the department’s building complex bearing his name. He set an example of excellence in both scholarship and teaching, with a constant focus on the people of the department, treating everyone with respect and appreciation. His legacy will live on at UNC-Chapel Hill

His page on the university website lists his honours, books and publications.

The Computer History Museum has an interview of Dr Brooks by Grady Booch.

He leaves his wife of 66 years Nancy, three children, nine grandchildren and two great-grandchildren.

 

Sunday, October 30, 2022

Warnings about design and microservices

A few days ago I read a set of observations on microservices that struck me as more than apt, especially now that microservices seem to be the universal recipe for every company. If you browse job postings, they have been the star of the listings for months; and it is much the same in corporate presentations. I follow the articles offered on Medium, and their presence there is overwhelming. In fact, the observations I am commenting on were published there.

Are microservices really a total answer? Giedrius Kristinaitis, in these recommendations, calls that into question and shovels in some welcome common sense:

What you need to answer yourself is how microservices will help your particular situation. Think about your situation, and don’t blindly copy what big tech companies do, because their domain is most likely different from yours, and they have their own reasons that might not exist for you. You can listen to their general advice, just don’t be like “oh, this company is doing X to solve their Y problem, so we’ll do the same” when you don’t really have a Y problem.

Giedrius recalls a very simple truth: do not apply a template; examine your problem:

Saying things like “if we use microservices we’ll be able to reduce development costs, we’ll scale better, etc.” is not a good answer, because it’s very generic and does not explain how.

Here’s what a good answer might look like: “we need to process a lot of batches of X data, however, we can’t do it anymore, we can’t scale because each batch is unnecessarily coupled to process Y which can’t be made any faster, nor does it need to, so we need X to be decoupled from Y”.

Such an answer would tell exactly what problem you’re having and why. Identifying your problem is very important. If you can’t identify your problem you’re at a high risk of making your life too complicated by needlessly starting with microservices.

Giedrius's advice is not to rush into setting up a microservices architecture, but to concentrate on the problem, above all by progressively reworking the design and architecture of the monolithic application you are starting from. He recommends reducing coupling and dependencies between parts of the system, perhaps extracting parts that can be handled as services:

(...) if you don’t think about making your system loosely coupled, and if you don’t think about loose coupling, no matter what architecture you choose, it’s probably not gonna work out, microservices included. (...) So if you think that you must start with microservices from the get-go you’re already implying that your services will be too coupled and too static to actually qualify as microservices. If you can have a loosely coupled monolithic system, you will be able to convert it to microservices.

If you can’t have a loosely coupled monolithic system, microservices will make your life even worse, a lot worse.
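
As a minimal sketch of what a "loosely coupled monolith" can mean in practice (the names and the enrichment example are mine, not Kristinaitis's): modules talk through narrow interfaces instead of reaching into each other, so a part can later be lifted out as a service by swapping an implementation, not rewriting the callers.

    // Batch processing (X) depends on an interface, not on the concrete
    // enrichment step (Y). Both can live in one process today...
    interface Enricher {
      enrich(record: string): Promise<string>;
    }

    class InProcessEnricher implements Enricher {
      async enrich(record: string): Promise<string> {
        return record.toUpperCase(); // placeholder for the real Y logic
      }
    }

    // ...and tomorrow Y can become a remote service without touching X.
    // (Assumes Node 18+, where fetch is available globally.)
    class RemoteEnricher implements Enricher {
      constructor(private baseUrl: string) {}
      async enrich(record: string): Promise<string> {
        const res = await fetch(`${this.baseUrl}/enrich`, { method: "POST", body: record });
        return res.text();
      }
    }

    async function processBatch(records: string[], enricher: Enricher): Promise<string[]> {
      return Promise.all(records.map((r) => enricher.enrich(r)));
    }

    // The caller chooses the wiring; processBatch never changes.
    processBatch(["a", "b"], new InProcessEnricher()).then(console.log);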

Giedrius shifts the focus to stressing that, first and foremost, this step is a design problem, and that this is what must be clear from the outset, setting aside decisions made "because Netflix did it." Reflect on the current design and its chaos, analyzing the practices that led to the mess you now want to correct. Without this step, the failure will repeat itself:

The old monolithic system is a huge pile of spaghetti and needs to be rewritten. The biggest mistake you can make in such a situation is not learning from past mistakes. You should sit down and closely inspect what bad (engineering) practices or processes led to the state that it’s in.

If you don’t do that you’re bound to repeat the same mistakes when you rewrite the system. You know what they say, history repeats itself, and the only way to prevent it is to learn about history.

You just can’t rush into a new project with the same engineering practices you used in the old one and expect things to magically turn out different this time around. The old one failed for a lot of reasons, and you can’t ignore them. Everyone working on the new project should be informed about them.

I recommend reading it and thinking these observations over. The article is broader, but this is the part that particularly interests me.


More about Meta

Meta (Facebook) is sinking on the stock market, with fresh falls in its value:

Meta opened the trading day today at the same price as seven years ago, when the company was still called Facebook and seemed to have an enormous future ahead of it. A 20% drop in the share price after the presentation of the latest quarterly figures has wiped out everything gained since then and shown that some tech giants actually have feet of clay. (...) The third-quarter figures are, frankly, much worse than analysts expected. In one year, half the profits have evaporated. Last year, at the close of the third quarter, the company reported earnings of nearly $9 billion. This year the figure barely exceeds $4.39 billion, 52% less. Earnings per share have fallen 49% even though the company's revenues have been relatively stable, with a decline of only 4% that can easily be attributed to the general economic climate. (Ángel Giménez de Luis, in El Mundo)

In an era in which the profit comes from your data, in which, if you ask yourself what their business is, it is nothing other than knowledge about you put up for sale, its decline may be a positive event. Each participant's data is the key point for Meta (and for many others):

The problems began, hard as it may be to believe, with a simple software update. At the end of last year, Apple introduced new privacy controls on the iPhone that allow users to limit the amount of information that the apps on their phones can extract.

Until then, Meta relied on its digital omnipresence to build very detailed user profiles. It collected information not only on the use of its own applications, but also on many others in which it embedded tracking code. This allowed it to be very effective, and therefore to charge more, in the online advertising business.

But with the new changes it has lost a major competitive advantage in its most important market, the US, where the share of people using iPhones is very high. Nor does it help that Google has decided to follow a similar path with Android, increasingly restricting the quantity and quality of the data it shows developers, unless users explicitly opt in to sharing it.

The thing is, if the business is selling air and living in a meta-reality, its reach may prove very fragile, because the daily life of society probably unfolds, and will keep unfolding, in a different environment, not in the fiction:

Meta's other problem is that it has decided to bet its future on a single card: virtual reality. Last year it announced its name change, justifying it as a better reflection of its intentions. Zuckerberg believes that in the near future most of our digital life, in leisure as much as in work, will take place in virtual environments, something he collectively calls "the metaverse," which is why we now speak of Meta instead of Facebook.

We already went through a Second Life.

Giménez de Luis also points to TikTok, which competes on Meta's own turf, taking significant slices of its followers. With the aggravating factor that TikTok represents the ever-growing presence of China as a competitor for hegemony. The same aims or worse, if we bear in mind China's far-from-virtual totalitarianism.


Tuesday, October 18, 2022

Don't bet your projects on Google II

As we have said before, confidence in the continuity of a Google project or product tends to zero. So much so that there is a page, "Killed by Google," with a tally of products and initiatives that were popular in their day and were abandoned. Saying "abandoned" means that whatever anyone had invested has been lost, or barely salvaged at the cost of reengineering.

Liz Martin on Medium (Why Google Keeps Killing Its Products):

(...) But here’s the thing: killing off projects is part of Google’s innovation process. Many of the Google products that people use today include features from things that no longer exist.

For example, Google Inbox was killed off in 2019 but many of its features migrated over to Gmail. Google Play Music was killed off in 2020, but several of its features are being used in Youtube Music. Google Allo was killed off in 2019, but its best features were ported over to Android Messages.

(...) Google exists in a fast-paced space. The faster the company can fail, the more quickly it can innovate and beat the competition to the newest technological advancement. No matter how chaotic, these calculated risks are the method to Google’s madness.

Question: What do you think Google will kill off next? What product would you like to see Google bring back to life?

 

Wednesday, September 21, 2022

Meta in trouble

Within the bloc of Big Tech, the Big Four or Big Five, depending on the classification criteria, there is one common element that weighs with colossal mass on the technology industry and on its research and evolution: the monopoly power to impose trends and bend the course of development to their own criteria. In this sense they have lost the primary halo of "good" tech companies that all of them enjoyed, to a greater or lesser degree, in their beginnings: innovative, open, promoters of intelligence and initiative, participants in all kinds of initiatives for social improvement. For years now they have been the focus of US and European authorities' reviews of monopolistic practices, and front-line lobbyists for their own projects, with sanctions piling up. Among them, in my view, two stand out: Facebook (now Meta) and Twitter. Facebook was particularly scandalous and exposed during Donald Trump's presidency. The point is that with a mass of participating users approaching three billion, its capacity for manipulation is akin to having a single government ruling the United States, Europe, Russia, and China, and this is part of its business.

Nevertheless, whether through exhaustion or competition, there came a moment when, for the first time, it did not grow, and that set off alarms. The escape route imagined by its leadership was to launch Meta with the new paradigm of the "Metaverse": "a digital extension of the physical world by social media, virtual reality and augmented reality features." Meta sells virtual life, air on the network, and its project carries risks that this year do not seem equally virtual. In El Economista:

(...) last February, the social network presented its results and, with them, came the first period in which users did not grow. This triggered the company's collapse and the biggest one-day plunge in its founder's fortune, marking a historic fall of 31 billion in a single session.

The absence of new sign-ups on the platform reveals two things: competition with TikTok and a smaller advertising budget from advertisers. In the first case, Zuckerberg's social network has found a great rival in the Chinese one, thanks to the success of its format, short videos. In the second, the deterioration of economic conditions has weighed down the company's revenues.

Moreover, the all-in bet on the Metaverse has required, and will keep requiring, enormous investments, something that has weighed on the company's value this year. In fact, Zuckerberg himself said that the company's new venture was loss-making and would mean losses for three to five years. On top of that, in recent times the former Facebook has come under greater regulatory scrutiny.

Compared with its competitors, it is the worst stock-market performer. Meta Platforms has shed 57% of its value so far this year, surpassed only by Netflix, down 60%. The negative returns of Apple, Amazon, and Alphabet, however, are much less significant: -14%, -26%, and -29%, respectively.

In short, Darwinism in technological evolution can also catch up with the T-Rex.


Sunday, September 11, 2022

Don't bet your projects on Google

In an era in which, at the apex of the pyramid of providers of technology, infrastructure, and software development, there is a very small number of players (Microsoft, AWS (Amazon), Google (Alphabet), Oracle, Facebook (Meta)), the reliability of their services ought to be fundamental. What actually operates, however, is monopolistic management of the market's evolution and offerings. It is very common to see a small company stand out for a couple of years in a market niche until it is bought by some prominent member of the pyramid. And this does not mean that the differentiating discovery of that company will be put to multiplying use by the buyer. More likely it will be shunted onto a dead-end track within another couple of years. The sellers celebrate the deal, and those who trusted the startup and adopted its product are probably lost.

Within this framework, Google stands out in one particular respect: researching, offering something novel in some market area, promoting it and getting thousands of adopters excited, and then, from one day to the next, announcing that the product, process, or whatever it may be, will be discontinued the following year. And the thousands of enthusiastic users, the ones who demonstrated how important the new thing was, the early birds, have to start planning (at a loss) how to get out of the pen with the least possible damage. Google Cloud IoT service is its most recent display of arbitrariness in handling the market and its customers. It is remarkable to visit the product page, where its services and great value are described, while across the top of the page an overlay warns that the service ends on August 16, 2023.

InfoQ, where I saw this news, puts it this way:

Google Cloud IoT Core is a fully-managed service that allows customers to connect, manage, and ingest data from millions of globally dispersed devices quickly and securely. Recently, Google announced discontinuing the service - according to the documentation, the company will retire the service on the 16th of August, 2023. 

The company released the first public beta of IoT Core in 2017 as a competing solution to the IoT offerings from other cloud vendors – Microsoft with Azure IoT Hub and AWS with AWS IoT Core. In early 2018, the service became generally available. Now, the company emailed its customers with the message that "your access to the IoT Core Device Manager APIs will no longer be available. As of that date, devices will be unable to connect to the Google Cloud IoT Core MQTT and HTTP bridges, and existing connections will be shut down." Therefore, the lifespan of the service is a mere five years.

(...) In addition, over the years, various companies have even shipped dedicated hardware kits for those looking to build Internet of Things (IoT) products around the managed service. Cory Quinn, a cloud economist at The Duckbill Group, tweeted:

I bet @augurysys is just super thrilled by their public Google Cloud IoT Core case study at this point in the conversation. Nothing like a public reference for your bet on the wrong horse.

Last year, InfoQ reported on Enterprise API and the "product killing" reputation of the company - where the community also shared their concerns and sentiment.  And again, a year later, Narinder Singh, co-founder, and CEO at LookDeep Health, as an example expressed a similar view in a tweet:

Can't believe how backwards @Google @googlecloud still is with regards to the enterprise.  Yes, they are better at selling now, but they are repeatedly saying through their actions you should only use the core parts of GCP.

 (...) Lastly, already a Google Partner, ClearBlade announced a full-service replacement for the IoT Core with their service, including a migration path from Google IoT Core to ClearBlade. An option for customers, however, in the Hacker News thread, a respondent, patwolf, stated:

I've been successfully using Cloud IoT for a few years. Now I need to find an alternative. There's a vendor named ClearBlade that announced today a direct migration path, but at this point, I'd rather roll my own.

How many times has this happened before? What guarantee of prospering does a business have if this is its provider's reliability? As with a car, practice "defensive driving," and know whom you are dealing with: keep a couple of escape routes, and if you can, avoid the giant.

Sunday, August 7, 2022

Todd Montgomery: Unblocked by design


Read on InfoQ, which published a talk given at QCon Plus in November 2021. A point of view far from how I have always worked, but with arguments that make it worth attention. Todd Montgomery argues for asynchronous design of processes, considering first of all that sequentiality is an illusion:

All of our systems provide this illusion of sequentiality, this program order of operation that we really hang our hat on as developers. We look at this and we can simplify our lives by this illusion, but be prepared, it is an illusion. That's because a compiler can reorder, runtimes can reorder, CPUs can reorder. Everything is happening in parallel, not just concurrently, but in parallel on all different parts of a system, operating systems as well as other things. It may not be the fastest way to just do step one, step two, step three. It may be faster to do steps one and two at the same time or to do step two before one because of other things that can be optimized. By imposing order on that we can make some assumptions about the state of things as we move along. Ordering has to be imposed. This is done by things in the CPU such as the load/store buffers, providing you with this ability to go ahead and store things to memory, or to load them asynchronously. Our CPUs are all asynchronous.

Storages are exactly the same way, different levels of caching give us this ability for multiple things to be optimized along that path. OSs with virtual memory and caches do the same thing. Even our libraries do this with the ideas of promises and futures. The key is to wait. All of this provides us with this illusion that it's ok to wait. It can be, but that can also have a price, because the operating system can de-schedule. When you're waiting for something, and you're not doing any other work, the operating system is going to take your time slice. It's also lost opportunity to do work that is not reliant on what you're waiting for. In some application, that's perfectly fine, in others it's not. By having locks and signaling in that path, they do not come for free, they do impose some constraints.

Setting the context first:

When we talk about sequential or synchronous or blocking, we're talking about the idea that you do some operation. You cannot continue to do things until something has finished or things like that. This is more exaggerated when you go across an asynchronous binary boundary. It could be a network. It could be sending data from one thread to another thread, or a number of different things. A lot of these things make it more obvious, as opposed to asynchronous or non-blocking types of designs where you do something and then you go off and do something else. Then you come back and can process the result or the response, or something like that.

How he sees synchrony:

I'll just use as an example throughout this, because it's easy to talk about, the idea of a request and a response. With sync or synchronous, you would send a request, there'll be some processing of it. Optionally, you might have a response. Even if the response is simply just to acknowledge that it has completed. It doesn't always have to involve having a response, but there might be some blocking operation that happens until it is completed. A normal function call is normally like this. If it's sequential operation, and there's not really anything else to do at that time, that's perfectly fine. If there are other things that need to be done now, or it needs to be done on something else, that's a lost opportunity.

And asynchrony:

Async is more about the idea of initiating an operation, having some processing of it, and you're waiting then for a response. This could be across threads, cores, nodes, storage, all kinds of different things where there is this opportunity to do things while you're waiting for the next step, or that to complete or something like that. The idea of async is really, what do you do while waiting? It's a very big part of this. Just as an aside, when we talk about event driven, we're talking about actually the idea of on the processing side, you will see a request come in. We'll denote that as OnRequest. On the requesting side, when a response comes in, you would have OnResponse, or OnComplete, or something like that. We'll use these terms a couple times throughout this.

Montgomery's aim is to process asynchronously and to make use of the dead time:

The key here is while something is processing or you're waiting, is to do something, and that's one of the takeaways I want you to think of. It's a lost opportunity. What can you do while waiting and make that more efficient? The short answer is, while waiting, do other work. Having the ability to actually do other stuff is great. The first thing is sending more requests, as we saw. The sequence here is, how do you distinguish between the requests? The relationship here is you have to correlate them. You have to be able to basically identify each individual request and individual response. That correlation gives rise to having things which are a little bit more interesting. The ordering of them starts to become very relevant. You need to figure out things like how to handle things that are not in order. You can reorder them. You're just really looking at the relationship between a request and a response and matching them up. It can be reordered in any way you want, to make things simple. It does provide an interesting question of, what happens if you get something that you can't make sense of. Is it invalid? Do you drop it? Do you ignore it? In this case, you've sent request 0, and you've got a response for 1. In this point, you're not sure exactly what the response for 1 is. That's handling the unexpected.
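
A minimal sketch of the correlation he describes (assumed names, not Montgomery's code): each request carries an ID, responses are matched back to their pending requests, and an unexpected response is surfaced explicitly instead of being trusted.

    // Correlate async responses to requests by ID; unexpected IDs are
    // surfaced instead of silently trusted ("handling the unexpected").
    type Pending = { resolve: (body: string) => void };

    class Correlator {
      private nextId = 0;
      private pending = new Map<number, Pending>();

      send(body: string, transmit: (id: number, body: string) => void): Promise<string> {
        const id = this.nextId++;
        // Register the pending entry before transmitting, in case the
        // response arrives synchronously.
        const promise = new Promise<string>((resolve) => this.pending.set(id, { resolve }));
        transmit(id, body);
        return promise;
      }

      // Called by the transport when a response arrives, possibly out of order.
      onResponse(id: number, body: string): void {
        const entry = this.pending.get(id);
        if (!entry) {
          console.warn(`unexpected response for id ${id}: drop? log? fail?`);
          return;
        }
        this.pending.delete(id);
        entry.resolve(body);
      }
    }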

(...) This is an async duty cycle. This looks like a lot of the duty cycles that I have written, and I've seen written and helped write, which is, you're basically sitting in a loop while you're running. You usually have some mechanism to terminate it. You usually poll inputs. By polling, I definitely mean going to see if there's anything to do, and if not, you simply return and go to the next step. You poll if there's input. You check timeouts. You process pending actions. The more complicated work is less in the polling of the inputs and handling them, it's more in the checking for timeouts, processing pending actions, those types of things. Those are a little bit more complex. Then at the end, you might idle waiting for something to do. Or you might just say, ok, I'm going to sleep for a millisecond, and you come right back. You do have a little bit of flexibility here in terms of idling, waiting for something to do.
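
That duty cycle maps almost line by line onto code. A schematic version (my sketch, not taken from the talk): poll inputs, check timeouts, process pending actions, then idle briefly only if a full pass found nothing to do.

    // Schematic async duty cycle, in the shape Montgomery describes.
    interface Agent {
      pollInputs(): number;               // returns how many inputs were handled
      checkTimeouts(now: number): number; // returns how many timeouts fired
      processPendingActions(): number;    // returns how many actions ran
    }

    const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

    async function run(agent: Agent, running: () => boolean): Promise<void> {
      while (running()) {
        let workDone = 0;
        workDone += agent.pollInputs();
        workDone += agent.checkTimeouts(Date.now());
        workDone += agent.processPendingActions();
        if (workDone === 0) {
          await sleep(1); // idle strategy: sleep ~1ms and come right back
        }
      }
    }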

Frankly, these concepts seem complicated to apply in an everyday work process, and more viable when building work at the operating-system level. Montgomery's interlocutor (Printezis) sees it exactly that way: "You did talk about the duty cycle and how you would write it. In reality, how much a developer would actually write that, but instead use a framework that will do most of the work for them?"

Montgomery's answer:

(...) Beyond that, I mean, patterns and antipatterns, I think, learning queuing theory, which may sound intimidating, but it's not. Most of it is fairly easy to absorb at a high enough level that you can see far enough to help systems. It is one of those things that I think pays for itself. Just like learning basic data structures, we should teach a little bit more about queuing theory and things behind it. Getting an intuition for how queues work and some of the theory behind them goes a huge way, when looking at real life systems. At least it has for me, but I do encourage people to look at that. Beyond that, technologies frameworks, I think by spending your time more looking at what is behind a framework. In other words, the concepts, you do much better than just looking at how to use a framework. That may be front and center, because that's what you want to do, but go deeper. Go deeper into, what is it built on? Why does it work this way? Why doesn't it work this other way? Asking those questions, I think you'll learn a tremendous amount. (...)

The conversation goes on and drifts into other related matters. Worth reading and rereading; I will have to come back to it more than once.

I see here a way of approaching processes far from the way I have usually worked, but I must admit that over the last five or six years conceptual changes have come in abundance, and I would say we are in a fifth or sixth generation, far from what we called the fourth generation twenty or thirty years ago. Time will show what has proved durable and what has turned down a dead end. I am willing to listen.

 


Nightmares in the cloud

Forrest Brazeal, currently a Google Cloud employee ("An AWS Hero turned Google Cloud employee, I explore the technical and philosophical differences between the two platforms. My biases are obvious, but opinions are my own"), pointed out in July that every cloud developer's worst nightmare is a recursive call in their tests, one that escalates the account's bill from a few dollars/euros into the "thousands" (say, 50,000). And a recursive call that generates thousands of processed invocations can occur in any test:

AWS calls it the recursive runaway problem. I call it the Hall of Infinite Functions - imagine a roomful of mirrors reflecting an endless row of Lambda invocations. It’s pretty much the only cloud billing scenario that gives me nightmares as a developer, for two reasons:

  • It can happen so fast. It’s the flash flood of cloud disasters. This is not like forgetting about a GPU instance and incurring a few dollars per hour in linearly increasing cost. You can go to bed with a $5 monthly bill and wake up with a $50,000 bill - all before your budget alerts have a chance to fire.

  • There’s no good way to protect against it. None of the cloud providers has built mechanisms to fully insulate developers from this risk yet.

Brazeal points to an incident described in detail by its own victims (We Burnt $72K testing Firebase + Cloud Run and almost went Bankrupt) that gives an idea of the problem. In that case the bill went from a potential $7 to $72,000...
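
Nothing fully insulates you, which is Brazeal's point, but one cheap partial guard is to thread a hop counter through the event payload and refuse to fan out past a hard ceiling. A sketch under that assumption (the invokeSelf callback stands in for whatever re-invocation mechanism the platform provides; the names are mine):

    // Partial guard against runaway recursion: carry a hop counter in the
    // event payload and refuse to fan out past a hard ceiling.
    const MAX_HOPS = 10; // assumption: legitimate chains are short

    interface WorkEvent {
      payload: string;
      hops?: number;
    }

    async function handler(
      event: WorkEvent,
      invokeSelf: (e: WorkEvent) => Promise<void>
    ): Promise<void> {
      const hops = event.hops ?? 0;
      if (hops >= MAX_HOPS) {
        // Fail loudly: better a dropped message than a $72K bill.
        throw new Error(`possible recursive runaway: ${hops} hops for ${event.payload}`);
      }
      // ... do the real work, then (maybe) trigger the next stage:
      if (event.payload.length > 1) {
        await invokeSelf({ payload: event.payload.slice(1), hops: hops + 1 });
      }
    }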

Sudeep Chauhan, the protagonist of that incident, later wrote, after putting his house in order, a list of recommendations for working with a cloud services provider.

Note: Renato Losio, in InfoQ, mentions and extends Brazeal's article, recalling another piece by Brazeal on the AWS free tier.


Saturday, August 6, 2022

You probably don't need microservices

Matthew Spence, in ITNEXT, swimming against the enormous wave of microservices hype, develops a consistent set of arguments putting the importance and necessity of microservices into perspective (You don't need microservices). I will highlight only the argument about the simplicity of microservices and the advantages derived from it:

"Simpler, Easier to Understand Code"

This benefit is at best disingenuous, at worse, a bald-faced lie.

Each service is simpler and easier to understand. Sure. The system as a whole is far more complex and harder to understand. You haven’t removed the complexity; you’ve increased it and then transplanted it somewhere else.

(...) Although microservices enforce modularization, there is no guarantee it is good modularization. Microservices can easily become a tightly coupled “distributed monolith” if the design isn’t fully considered.

(...) The choice between monolith and microservices is often presented as two mutually exclusive modes of thought. Old school vs. new school. Right or wrong. One or the other.

The truth is they are both valid approaches with different trade-offs. The correct choice is highly context-specific and must include a broad range of considerations.

The choice itself is a false dichotomy and, in certain circumstances, should be made on a feature-by-feature basis rather than a single approach for an entire organization’s engineering team.

Should you consider microservices?

As is often the case, it depends. You might genuinely benefit from a microservices architecture.

There are certainly situations where they can pay their dues, but if you are a small to medium-sized team or an early-stage project:

No, you probably don’t need microservices.

 

Sunday, July 31, 2022

Liam Allan talks about Node on IBM i

Liam Allan, like Scott Klement, has given a formidable boost to the IBM i (AKA AS/400, iSeries), exploring, popularizing, and exploiting the successive technological changes the machine has seen over the years. Liam's comment on Node comes in the interview Charles Guarino conducted with him in TechChannel. Liam's involvement, recent as it is, has brought radical changes in how the IBM i is approached, starting with its program editor. It must be said that the environment and practices around the IBM i have historically been rather conservative, fitting for a set of machines that used to be the processing core of the companies using them. Guarino says of this aspect: "I still think there’s still a lot of newbies—even the most seasoned RPG developers are still newbies—and open-source makes them nervous, perhaps because it’s a whole different paradigm, a whole different vernacular. Everything about it is different, yet obviously there are so many similarities, but the terminology is very different." Klement and those who followed him, and now Allan, have represented a renewal and an updating that is more than convenient: necessary.

For my part, I am still mulling over its use with Plex. Klement has already strengthened its integration with his proposals for integrating the Java and C/C++ languages through ILE.

What he said about Node:

Charlie: (...) So Liam, I do have a lot of things that I want to talk to you about, but when I think of you lately what comes to my mind is Node. I mean I kind of associate you with just Node and how you really are really running with that technology, especially on IBM i, but I think there are a lot of people who don’t quite understand where that fits in, what Node actually is and how it fits on your platform. So what can you say about that in general?

Liam: Absolutely. So I mean, there’s a few points to be made. I guess I’ll start with the fact that you know, it is 80% of my working life is writing typescript and Javascript. So I spend most of my days in it now, which is great. A few years ago, it was more like 50% and each year it’s growing more and more. So I usually focus on how it can integrate with IBM i. So you know having Node.js code, whether it’s typescript or Javascript talking to IBM i via the database—so, calling programs, fetching data, updating data; you know, the minimal standard kind of driver type stuff that you do, crud, things like that. What I especially like about Node on IBM i is that it is made for high input/outputs. It’s great at handling large volumes of data and most people that are using IBM i tend to have tons of data, right? Db2 for i has been around for centuries at this point; it’s older than I am, and I can make that joke. No one else can make that joke but I can make it and you know it’s been around for the longest time. And so people have got all of this data and in my opinion Node.js is just a great way to express that data—you know, via an API. I think it’s fast. It’s got high throughput and yeah, it’s a synchronous in its standard. It’s easy to use, it’s easy to deploy, it’s easy to write code for especially. One of the reasons I like is the fact that I can have something working within 20 minutes. It’s a fantastic piece of technology and it’s been out for a while. I mean it’s been out for like 10 years, 10 years plus at this point. It’s just fun to use. I really enjoy it and I encourage other people to use it too.
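
As a taste of the "minimal standard kind of driver type stuff" Liam mentions, here is a hedged sketch using the npm odbc package against Db2 for i. The DSN, schema, and stored procedure are hypothetical, and this is not code from the interview; it only illustrates the fetch-data / call-program pattern he describes:

    import odbc from "odbc";

    async function main(): Promise<void> {
      // Hypothetical DSN pointing at an IBM i system.
      const connection = await odbc.connect("DSN=MYIBMI");

      // Fetching data: plain SQL against Db2 for i.
      const rows = await connection.query(
        "SELECT ORDER_ID, TOTAL FROM SHOP.ORDERS FETCH FIRST 10 ROWS ONLY"
      );
      console.log(rows);

      // Calling a program: an RPG/CL program exposed as a stored procedure.
      await connection.query("CALL SHOP.POST_ORDER(?)", [42]);

      await connection.close();
    }

    main().catch(console.error);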

Sunday, July 3, 2022

The "legacy" concept and the "microservice" carrot

What follows is an "ancient" article, from April 25 of this year. I copy it here and will comment where necessary, because it remains strictly current, both in the IBM i universe and in general:

Beware The Hype Of Modern Tech

Many IBM i shops are under the gun to modernize their applications as part of a digital transformation initiative. If the app is more than 10 or 15 years old and doesn’t use the latest technology and techniques, it’s considered a legacy system that must be torn down and rebuilt according to current code. But there are substantial risks associated with these efforts – not the least of which that the modern method is essentially incompatible with the IBM i architecture as it currently exists. IBM i shops should be careful when evaluating these new directions.

Amy Anderson, a modernization consultant working in IBM’s Rochester, Minnesota, lab, says she was joking last year when she said “every executive says they want to do containerized microservices in the cloud.” If Anderson is thinking about a future in comedy, she might want to rethink her plans, because what she says isn’t a joke; it’s the truth.

Many, if not most, tech executives these days are fully behind the drive to run their systems as containerized microservices in the cloud. They have been told by the analyst firms and the mainstream tech press and the cloud giants that the future of business IT is breaking up monolithic applications into lots of different pieces that communicate through microservices, probably REST. All these little apps will live in containers, likely managed by Kubernetes, enabling them to scale up and down seamlessly on the cloud, likely AWS or Microsoft Azure.

The “containerized microservices in the cloud” mantra has been repeated so often, many just accept it as the gospel truth. Of course that is the future of business tech! they say. How else could we possibly run all these applications? It’s accepted as an article of faith that this is the right approach. Whether a company is running homegrown software or a packaged app, they’re adamant that the old ways must be left behind, and to embrace the glorious future that is containerized microservices running in the cloud.

 The reality is that the supposedly glorious future is today is a pipe dream, at least when it comes to IBM i. Let’s start with Kubernetes, the container orchestration system open sourced by Google in 2014, which is a critical component of running in the “cloud native” way. (...)

While Kubernetes solves one problem – eliminating the complexity inherent in deploying and scaling all the different components that go into a given application – it introduces a lot more complexity to the user. Running a Kubernetes cluster is hard. If you’ve talked to anybody who has tried to do it themselves, you’ll quickly find out that it’s extremely difficult. It requires a whole new set of skills that most IT professionals do not have. The cloud giants, of course, have these folks in droves, but they’re practically non-existent everywhere else.

ISVs are eager to adopt Kubernetes as the new de facto operating system for one very good reason: because it helps them run their applications on the cloud. (...) 

For greenfield development, the cloud can make a lot of sense. Customers can get up and running very quickly on a cloud-based business application, and leave all the muss and fuss of managing hardware to the cloud provider. But there are downsides too, such as no ability to customize the application. For the vendors, the fact that customers cannot customize goes hand in hand with their inability to fall behind on releases. (Surely the vendor passes whatever benefit it receives through collective avoidance of technical debt back to you, dear customer.)

The Kubernetes route makes less sense for established products with an established installed base. It takes quite a bit of work to adapt an existing application to run inside a Docker container and have it managed in a Kubernetes pod. It can be done, but it’s a heavy lift. But when it comes to critical transactional systems, it likely becomes more of a full-blown re-implementation than a simple upgrade. There are no free lunches in IT.

When it comes to IBM i, lots of existing customers who are running their ERP systems on-prem are not ready to move their production business applications to the cloud. Notice what happened when Infor stopped rolling out enhancements for the M3 applications for IBM i customers. Infor wanted these folks to adopt M3 running on X86 servers running in AWS cloud. Many of them balked at this forced re-implementation, and now Infor is rolling out a new offering called CM3 that recognizes that customers want to keep their data on prem in their Db2 for i server.

Other ERP vendors have taken a similar approach to the cloud. SAP wants its Business Suite customers to move to S/4 HANA, which is a containerized, microservice-based ERP running in the cloud. The German ERP giant has committed to supporting on-prem Business Suite customers until 2027, and through 2030 with an extended maintenance agreement. After that, the customers must be on S/4 HANA, which at this point doesn’t run on IBM i.

Will the 1,500-plus customers who have benefited from running SAP on IBM i for the past 30 years be willing to give up their entire legacy and begin anew in the S/4 HANA cloud? It sounds like a risky proposition, especially given the fact that much of the functionality that currently exists in Business Suite has yet to be re-constructed in S/4 HANA. Is this an acceptable risk?

Kubernetes is just part of the problem, but it’s a big one, because at this point IBM i doesn’t support Kubernetes. It’s not even clear what Kubernetes running on IBM i would look like, considering all the virtualization features that already exist in the IBM i and Power platform. (What would become of LPARs, subsystems, and iASPs? How would any of that work?) In any event, the executives in charge of IBM i have told IT Jungle there is no demand for Kubernetes among IBM i customers. But that could change.

Particularly interesting is the comment on the plans of Jack Henry & Associates:

Jack Henry & Associates officially unleashed its long-term roadmap earlier this year, but it had been working on the plan for years. The company has been a stalwart of the midrange platform for decades, reliably processing transactions for more than a thousand banks and credit unions running on its RPG-based core banking systems. It is also one of the biggest private cloud providers in the Power Systems arena, as it runs the Power machinery powering (pun intended) hundreds of customer applications.

The future roadmap for Jack Henry is (you guessed it) containerized microservices in the cloud. The company explains that it doesn’t make sense to develop and maintain about 100 duplicate business functions across four separate products, and so it will slowly replace those redundant components that today make up its monolithic packages like Silverlake with smaller, bite-sized components that run in the cloud-native fashion on Kubernetes and connect and communicate via microservices.

It’s not a bad plan, if you’ve been listening to the IT analysts and the press for the past five years. Jack Henry is doing exactly what they’ve been espousing as the modern method. But how does it mesh with its current legacy? The reality is that none of Jack Henry’s future software will be able to run on IBM i. Db2 for i is not even one of the long-term options for a database; instead it selected PostgreSQL, SQL Server, and MongoDB (depending on which cloud the customer is running in).

Jack Henry executives acknowledge that there’s not much overlap between its roadmap and the IBM i roadmap at this point in time. But they say that they’re moving slowly and won’t have all of the 100 or so business functions fully converted into containerized microservices for 15 years – and then it will likely take another 15 years to get everybody moved over. So it’s not a pressing issue at the moment.

Maybe Kubernetes will run on IBM i by then? Maybe there will be something new and different that eliminates the technological mismatch? Who knows?

The IBM i system is a known entity, with known strengths and weaknesses. Containerized microservices in the cloud is an unknown entity, and its strengths and weaknesses are still being determined. While containerized microservices running in the cloud may ultimately win out as the superior platform for business IT, that hasn’t been decided yet.

For the past 30 years, the mainstream IT world has leapt from one shiny object to the next, convinced that it will be The Next Big Thing. (TPM, the founder of this publication and its co-editor with me, has a whole different life as a journalist and analyst chasing this, called The Next Platform, not surprisingly.) Over the same period, the IBM i platform has continued more or less on the same path, with the same core architecture, running the same types of applications in the same reliable, secure manner.

The more hype is lavished upon containerized microservices in the cloud, the more it looks like just the latest shiny object, which will inevitably be replaced by the next shiny object. Meanwhile, the IBM i server will just keep ticking.

There have undoubtedly been spectacular changes in just a few years, the last four or five, and tools and resources of great power are available. But for a company or institution already under way, a change has to be weighed carefully, avoiding the risk of stepping into a void. A change that requires new methodologies, new languages, new platforms, new communications? Development with the latest of the latest, without the backing of robust resources proven over several years?

Sunday, June 5, 2022

China, Gitee, GitHub

In MIT's Technology Review, on May 30, Zeyi Yang writes:

Earlier this month, thousands of software developers in China woke up to find that their open-source code hosted on Gitee, a state-backed Chinese competitor to the international code repository platform GitHub, had been locked and hidden from public view.

Gitee released a statement later that day explaining that the locked code was being manually reviewed, as all open-source code would need to be before being published from then on. The company “didn’t have a choice,” it wrote. Gitee didn’t respond to MIT Technology Review, but it is widely assumed that the Chinese government had imposed yet another bit of heavy-handed censorship.

For the open-source software community in China, which celebrates transparency and global collaboration, the move has come as a shock. Code was supposed to be apolitical. Ultimately, these developers fear it could discourage people from contributing to open-source projects, and China’s software industry will suffer as a result

One more display of the dependence on large actors that exists, first of all, in the Open Source world. But going further, an indication of the limited capacity for choice that exists in the world of technology and in the ideas and cultures conveyed through it. The problem the Chinese developers discovered with their own "official" repository could potentially repeat itself in the Western world, under the seal of the big tech companies that directly or indirectly dominate the open, "public" repositories and the cloud infrastructures and services. Neither Google nor Microsoft nor Amazon has demonstrated neutrality over its history, and they are protagonists of decades of lawsuits over unfair practices. Entrusting your codebase, or your applications, to this framework is probably not the most appropriate choice.

Sunday, April 24, 2022

Quantum computing or hype?


Sankar Das Sarma, physics researcher and director of the CMTC (Condensed Matter Theory Center) at the University of Maryland, published in Technology Review an article cooling down expectations for quantum computing:

A decade and more ago, I was often asked when I thought a real quantum computer would be built. (It is interesting that I no longer face this question as quantum-computing hype has apparently convinced people that these systems already exist or are just around the corner).  My unequivocal answer was always that I do not know. Predicting the future of technology is impossible—it happens when it happens. One might try to draw an analogy with the past. It took the aviation industry more than 60 years to go from the Wright brothers to jumbo jets carrying hundreds of passengers thousands of miles. The immediate question is where quantum computing development, as it stands today, should be placed on that timeline. Is it with the Wright brothers in 1903? The first jet planes around 1940? Or maybe we’re still way back in the early 16th century, with Leonardo da Vinci’s flying machine? I do not know. Neither does anybody else.

On Das Sarma's work: a list of the papers he has taken part in.

On the CMTC and its work, with a mention of its collaboration with Microsoft.

On the state of research in condensed matter physics, on Wikipedia.

The photo is taken from Microsoft's blog on quantum computing.


Sunday, April 17, 2022

Mary Poppendieck in perspective

At QCon Plus, a virtual InfoQ conference, a conversation with Mary Poppendieck is presented (Tom, her husband and partner in their long consulting career, also takes part in the talk). Mary offers a view of the changes in software construction since the turn of the century: twenty years of radical changes that overturned the paradigms we had relied on for decades. It strikes me as a perspective of particular interest, considering her own work at 3M, begun with Six Sigma, and her own understanding of agile concepts. Mary speaks of bridges; she herself is one, accompanying the change as it takes hold.

Misguided

In Technology Review:

On March 14, Shah and his University of Washington colleague Emily M. Bender, who studies computational linguistics and ethical issues in natural-language processing, published a paper that criticizes what they see as a rush to embrace language models for tasks they are not designed to address. In particular, they fear that using language models for search could lead to more misinformation and more polarized debate. 

“The Star Trek fantasy—where you have this all-knowing computer that you can ask questions and it just gives you the answer—is not what we can provide and not what we need,” says Bender, a coauthor on the paper that led Timnit Gebru to be forced out of Google, which had highlighted the dangers of large language models.