
ADR - public S3 bucket access

· 4 min read
Paweł Mantur
Solutions Architect

This post is an example of an ADR (Architecture Decision Record). ADRs document the reasons behind architectural decisions. ADRs promote a more transparent, fact-based decision-making culture. ADRs are also a useful artefact from a technical documentation perspective.

Context

When implementing this website (see the related article for an architecture overview), I needed, among other decisions, to decide how to set up the AWS S3 bucket that hosts the website files.

Decision

Although it is considered to be potentially unsafe, I have decided on Option 2: allowing public access 🙀. But let's not panic; I will explain why it is safe in this case.

Considered Options

❌ Option 1: Blocking public access to S3 bucket

  • Pros:
    • Safer option - zero trust is a holy rule of security, especially for public access. The AWS console blocks public access by default when creating S3 buckets, and it is also possible to block public access for the whole AWS account. Moreover, the AWS IAM Access Analyzer tool reports public access to buckets as a security finding. The AWS console and documentation really force you to think about this decision.
  • Cons - the solution gets more complicated:
    • The "Hosting a static website using Amazon S3" feature cannot be used, as it requires public access
    • Since the S3 website endpoint cannot be created, CloudFront needs to use the S3 REST API as its origin
    • A Docusaurus build creates a directory for each page/article and puts a single index.html file inside that directory. For better SEO and nicer URLs, links do not include index.html. Web servers know that in such a case (when the URL points to a directory) index.html should be served, and the S3 website endpoint also handles this properly. But since the S3 REST API is used under the hood, the request is followed as-is: for https://pawelmantur.pl/blog/s3-public-bucket it finds that there is a blog/s3-public-bucket directory in my S3 bucket, but since public access does not have directory-listing permissions granted, it returns 403. To return index.html, as a web server like nginx would do, we need to add a CloudFront Function that modifies the request URL and appends index.html.
    • This is possible and relatively simple to set up, but it requires introducing new components to the solution that can be avoided by leveraging the existing S3 website endpoint capability

✅ Option 2: Allowing public access to perform s3:GetObject action

  • Pros:
    • The bucket's only purpose is to host website files, and this website is public by design: I want people to access my blog, I want it to be public (the exact policy is sketched right after this list)
    • The S3 website endpoint can be used
    • The static website built with Docusaurus works out of the box with the S3 website endpoint, no CloudFront Function needed
    • No additional cost related to CloudFront Functions, although it is negligible for this website
    • A simpler setup with fewer components means less room for errors
  • Cons:
    • There is a risk that I will put confidential files into this bucket, but that risk is mitigated by automation: the only way this bucket is updated is by a GitHub Action that syncs only the build directory of the Docusaurus website, which is public by design. It is a well-defined, automated and tested process.
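For reference, the public access granted in Option 2 boils down to a bucket policy that allows only the s3:GetObject action. Below is a minimal sketch using the AWS SDK for .NET; the bucket name is a placeholder, and in practice the policy is managed as part of the infrastructure setup rather than applied from application code:

using Amazon.S3;
using Amazon.S3.Model;

// Minimal sketch: grant anonymous read-only access to objects in the website bucket.
// "my-website-bucket" is a placeholder name.
var s3 = new AmazonS3Client();

var publicReadPolicy = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}
""";

await s3.PutBucketPolicyAsync(new PutBucketPolicyRequest
{
    BucketName = "my-website-bucket",
    Policy = publicReadPolicy
});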

Consequences

Since a public S3 bucket has been introduced into the architecture, I will have to be cautious about what content I put there and what policies are granted for public access. But since we are talking about a single person running a blog, we can agree that the related risks can be accepted.

If this decision were made in the context of a large organization, the use cases for public sharing would need special governance. If an organization has no business workflows that require public file sharing and different solutions are available for static website hosting, then public access to S3 should not be allowed, to avoid a risky setup.


Watch out for Kafka costs

· 3 min read
Paweł Mantur
Solutions Architect

Non-obvious Confluent Cloud costs

When using AWS or any other cloud, we need to be aware of network traffic charges, especially cross-zone and cross-region data transfer fees.

Example deployment:

  • Cluster in Confluent Cloud, hosted in AWS (1 CKU is limited to a single AZ; from 2 CKUs we have a multi-AZ setup)
  • AWS PrivateLink for private connectivity with AWS
  • Kafka clients running in AWS EKS (multi-AZ)

Cross-AZ data transfer costs

Be aware that if a Kafka broker node happens to be running in a different AZ than the Kafka clients, additional data transfer charges will apply for cross-AZ traffic.

Kafka has the concept of racks, which allows co-locating Kafka clients with broker nodes. More details about this setting in the context of AWS and Confluent can be found here: https://docs.confluent.io/cloud/current/networking/fetch-from-follower.html
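On the client side this rack awareness is configured with the client.rack setting, which should match the AZ the client runs in. A minimal sketch with the Confluent.Kafka .NET client; the bootstrap server, group id and AZ id are placeholders:

using Confluent.Kafka;

// Minimal sketch: pin the consumer to its availability zone so it can fetch
// from a replica in the same AZ (fetch-from-follower) and avoid cross-AZ traffic.
var config = new ConsumerConfig
{
    BootstrapServers = "pkc-xxxxx.eu-central-1.aws.confluent.cloud:9092", // placeholder
    GroupId = "my-consumer-group",                                        // placeholder
    ClientRack = "euc1-az1" // the AZ id where this client runs; must match the broker rack ids
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();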

Data transfer costs within AZ

But even if we manage to keep connections within the same AZ, is consuming data from Kafka free?

Imagine an architecture in which a single topic contains data dedicated to multiple consumers. Every consumer processes only the relevant data and filters out (ignores) other messages. Sounds straightforward, but to filter data each consumer first needs to read the message. So even irrelevant data creates traffic from the broker to the clients.

Kafka does not support filtering on the broker side; there is an open feature request for that.
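To make this concrete, here is a minimal sketch of such client-side filtering with the Confluent.Kafka .NET consumer; the topic name and the "target" header used for routing are made up for illustration. Note that the message is fully transferred from the broker before the filter decides to ignore it:

using System.Text;
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "localhost:9092", // placeholder
    GroupId = "service-a",               // placeholder
    AutoOffsetReset = AutoOffsetReset.Earliest
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("shared-topic"); // assumption: one topic shared by many consumer groups

while (true)
{
    var result = consumer.Consume();

    // At this point the message has already been transferred (and billed as egress).
    string? target = result.Message.Headers.TryGetLastBytes("target", out var bytes)
        ? Encoding.UTF8.GetString(bytes)
        : null;

    if (target != "service-a")
    {
        continue; // filtered out on the client side, but the network cost was already paid
    }

    // ... process the relevant message ...
}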

If we have a lot of consumers, we will have a lot of outbound traffic (topic throughput × number of consumers). Routing such traffic through additional infrastructure like AWS PrivateLink will generate extra costs.

Extreme scenario - generating costs for nothing

Another interesting scenario is implementing a retry policy for failed message processing, for example when every message needs to be delivered to an endpoint which is down. If the Kafka consumer retries delivery very aggressively (for example every second, or even worse in an infinite loop), and every retry is a new read from the topic, then we can easily generate a lot of reads.
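One way to avoid paying for the same bytes over and over is to retry in memory with a backoff and commit the offset only after delivery succeeds, instead of re-reading the message from the topic on every attempt. A rough sketch; the endpoint URL, topic and retry parameters are assumptions:

using System.Net.Http;
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "localhost:9092", // placeholder
    GroupId = "delivery-service",        // placeholder
    EnableAutoCommit = false             // commit manually, only after successful delivery
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("outgoing-messages"); // placeholder topic
var http = new HttpClient();

var result = consumer.Consume();

// Retry the delivery in memory with a capped exponential backoff,
// instead of re-reading the same message from the topic on every attempt.
var delay = TimeSpan.FromSeconds(1);
while (true)
{
    var response = await http.PostAsync(
        "https://example.org/ingest", // placeholder endpoint
        new StringContent(result.Message.Value));

    if (response.IsSuccessStatusCode)
    {
        consumer.Commit(result); // the offset moves forward only once delivery has succeeded
        break;
    }

    await Task.Delay(delay);
    delay = TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 60));
}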

We may be fooled by most of the documentation, which states that reading from Kafka is very efficient as it is basically sequential reading from a log. From the broker cost perspective, multiple consumers are not a significant cost factor compared to things like written data volume, but we still need to be mindful of the data transfer costs that may apply to reads. Confluent charges $0.05/GB for egress traffic, so, for example, a topic producing 1 TB per month that is read by 10 consumer groups generates roughly 10 TB of egress, which is about $500 per month for reads alone. Total costs may grow quickly in a busy cluster with active producers and multiple reads of every message.

Schema Definition Formats

· 4 min read
Paweł Mantur
Solutions Architect
AI Friend
Assistant

Schema Definition Formats: JSON Schema, Avro, and Protocol Buffers

In data management, maintaining a specific structure is key for consistency and interoperability. Three popular schema formats are JSON Schema, Avro, and Protocol Buffers. Each has unique features and use cases. Let's explore their strengths and applications.

JSON Schema

Overview: JSON Schema is a powerful tool for validating the structure of JSON data. It allows you to define the expected format, type, and constraints of JSON documents, ensuring that the data adheres to a predefined schema.

Key Features:

  • Validation: JSON Schema provides a robust mechanism for validating JSON data against a schema. This helps in catching errors early and ensuring data integrity.
  • Documentation: The schema itself serves as a form of documentation, making it easier for developers to understand the expected structure of the data.
  • Interoperability: JSON Schema is widely supported across various programming languages and platforms, making it a versatile choice for many applications.

Use Cases:

  • API Validation: Ensuring that the data exchanged between client and server adheres to a specific format.
  • Configuration Files: Validating configuration files to ensure they meet the required structure and constraints.
  • Data Exchange: Facilitating data exchange between different systems by providing a clear contract for the data format.

Example:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Product",
  "type": "object",
  "properties": {
    "id": {
      "type": "integer"
    },
    "name": {
      "type": "string"
    },
    "price": {
      "type": "number"
    }
  },
  "required": ["id", "name", "price"]
}

Avro

Overview: Avro is a data serialization system that provides a compact, fast, and efficient format for data exchange. It is particularly well-suited for big data applications and is a key component of the Apache Hadoop ecosystem.

Key Features:

  • Compact Serialization: Avro uses a binary format for data serialization, which is more compact and efficient compared to text-based formats like JSON.
  • Schema Evolution: Avro supports schema evolution, allowing you to update the schema without breaking compatibility with existing data.
  • Interoperability: Avro schemas are defined using JSON, making them easy to read and understand. The binary format ensures efficient data storage and transmission.

Use Cases:

  • Big Data: Avro is widely used in big data applications, particularly within the Hadoop ecosystem, for efficient data storage and processing.
  • Data Streaming: Avro is commonly used in data streaming platforms like Apache Kafka for efficient data serialization and deserialization.
  • Inter-Service Communication: Facilitating communication between microservices by providing a compact and efficient data format.

Example:

{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string"}
  ]
}

Protocol Buffers (Protobuf)

Overview: Protocol Buffers, developed by Google, is a language-neutral, platform-neutral, extensible mechanism for serializing structured data. It is known for its efficiency and performance.

Key Features:

  • Compact and Efficient: Protobuf uses a binary format that is both compact and efficient, making it suitable for high-performance applications.
  • Language Support: Protobuf supports multiple programming languages, including Java, C++, and Python.
  • Schema Evolution: Protobuf supports backward and forward compatibility, allowing for schema evolution without breaking existing data.

Use Cases:

  • Inter-Service Communication: Commonly used in microservices architectures for efficient data exchange.
  • Data Storage: Suitable for storing structured data in a compact format.
  • RPC Systems: Often used in Remote Procedure Call (RPC) systems like gRPC.

Example:

syntax = "proto3";

message Person {
int32 id = 1;
string name = 2;
string email = 3;
}
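For completeness, here is a hedged sketch of what using the generated code might look like in C#, assuming protoc (for example via the Grpc.Tools package) has generated a Person class from the schema above:

using Google.Protobuf;

// 'Person' is assumed to be generated from the .proto definition above.
var person = new Person { Id = 1, Name = "Ada", Email = "ada@example.org" };

// Compact binary serialization...
byte[] bytes = person.ToByteArray();

// ...and deserialization on the receiving side.
Person decoded = Person.Parser.ParseFrom(bytes);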

Conclusion

JSON Schema, Avro, and Protocol Buffers each offer powerful tools for managing data schemas, each with its unique strengths. JSON Schema excels in validation and documentation, making it ideal for APIs and configuration files. Avro provides efficient serialization and schema evolution, making it a preferred choice for big data and streaming applications. Protocol Buffers offer compact and efficient serialization, making them suitable for high-performance applications and inter-service communication. Understanding the strengths and use cases of each format can help you choose the right tool for your specific needs.

Change Data Capture and Debezium

· 4 min read
Paweł Mantur
Solutions Architect
AI Friend
Assistant

Understanding Change Data Capture (CDC) and Debezium

In today's data-driven world, keeping track of changes in data is crucial for maintaining data integrity, enabling real-time analytics, and ensuring seamless data integration across systems. Change Data Capture (CDC) is a powerful technique that addresses this need by capturing and tracking changes in data as they occur. One of the most popular tools for implementing CDC is Debezium. This blog article delves into the concept of CDC, its importance, and how Debezium can be used to implement it effectively.

What is Change Data Capture (CDC)?

Change Data Capture (CDC) is a process that identifies and captures changes made to data in a database. These changes can include inserts, updates, and deletes. Once captured, the changes can be propagated to other systems or used for various purposes such as data warehousing, real-time analytics, and data synchronization.

Why is CDC Important?

  1. Real-Time Data Integration:

    • CDC enables real-time data integration by capturing changes as they happen and propagating them to other systems. This ensures that all systems have the most up-to-date information.
  2. Efficient Data Processing:

    • By capturing only the changes rather than the entire dataset, CDC reduces the amount of data that needs to be processed and transferred. This leads to more efficient data processing and reduced latency.
  3. Data Consistency:

    • CDC helps maintain data consistency across different systems by ensuring that changes made in one system are reflected in others. This is particularly important in distributed systems and microservices architectures.
  4. Historical Data Analysis:

    • CDC allows for the capture of historical changes, enabling organizations to perform trend analysis and understand how data has evolved over time.

Introducing Debezium

Debezium is an open-source CDC tool that supports various databases such as MySQL, PostgreSQL, MongoDB, and more. It reads changes from transaction logs and streams them to other systems, making it a powerful tool for implementing CDC.

Key Features of Debezium:

  • Wide Database Support: Debezium supports multiple databases, making it versatile and suitable for various environments.
  • Kafka Integration: Debezium integrates seamlessly with Apache Kafka, allowing for efficient streaming of changes.
  • Schema Evolution: Debezium handles schema changes gracefully, ensuring that changes in the database schema do not disrupt data capture.
  • Real-Time Processing: Debezium captures and streams changes in real-time, enabling real-time data integration and analytics.

How Debezium Works

Debezium works by reading the transaction logs of the source database. These logs record all changes made to the data, including inserts, updates, and deletes. Debezium connectors capture these changes and stream them to a Kafka topic. From there, the changes can be consumed by various applications or systems.

Steps to Implement CDC with Debezium:

  1. Set Up Kafka:

    • Install and configure Apache Kafka, which will be used to stream the changes captured by Debezium.
  2. Deploy Debezium Connectors:

    • Deploy Debezium connectors for the source databases. Each connector is responsible for capturing changes from a specific database.
  3. Configure Connectors:

    • Configure the connectors with the necessary settings, such as the database connection details and the Kafka topic to which the changes should be streamed.
  4. Consume Changes:

    • Set up consumers to read the changes from the Kafka topics and process them as needed. This could involve updating a data warehouse, triggering real-time analytics, or synchronizing data across systems.

Example Configuration

Here is a basic example of configuring a Debezium connector for a MySQL database:

{
  "name": "mysql-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "localhost",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "fullfillment",
    "database.include.list": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}

In this configuration:

  • connector.class specifies the Debezium connector class for MySQL.
  • database.hostname, database.port, database.user, and database.password provide the connection details for the MySQL database.
  • database.server.name is a logical name for the database server.
  • database.include.list specifies the databases to capture changes from.
  • database.history.kafka.bootstrap.servers and database.history.kafka.topic configure the Kafka settings for storing schema history.
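Once the configuration is ready, the connector is typically registered by POSTing it to the Kafka Connect REST API. A minimal sketch in C#; the Connect endpoint URL and file name are assumptions, and curl or any other HTTP client works just as well:

using System.IO;
using System.Net.Http;
using System.Text;

// Register the connector by sending its configuration to Kafka Connect.
// http://localhost:8083 is a placeholder for the Kafka Connect REST endpoint.
var connectorJson = File.ReadAllText("mysql-connector.json"); // the configuration shown above

using var http = new HttpClient();
var response = await http.PostAsync(
    "http://localhost:8083/connectors",
    new StringContent(connectorJson, Encoding.UTF8, "application/json"));

response.EnsureSuccessStatusCode();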

Conclusion

Change Data Capture (CDC) is a vital technique for modern data management, enabling real-time data integration, efficient data processing, and maintaining data consistency across systems. Debezium is a powerful open-source tool for implementing CDC, offering wide database support, seamless Kafka integration, and real-time processing capabilities. By leveraging Debezium, organizations can capture and propagate data changes effectively, ensuring that their systems are always up-to-date and ready for real-time analytics and decision-making.

Low code solution with Azure Logic Apps and PowerBI

· 4 min read

I've been recently working on a small project to support a new business process. Time to market was critical for the customer, to be able to capture an emerging business opportunity. The budget was also tight, so as not to over-invest before the business case was validated.

There was a strong preference from the customer to do all the data management via Excel to "keep it simple". Not a surprising preference when you talk to salespeople or, as in this case, the CEO :) There was also a need to enrich data with information from 3rd party systems and to provide a number of reports.

High level architecture of this small system looks like this:

The goal was not to avoid coding entirely, but to take a low-code approach to save time.

The biggest saving was avoiding custom UI development completely, while still keeping the solution highly interactive from the users' perspective. Below is a description of how this was achieved.

Online sign-up form

For the online sign-up form, https://webflow.com/ was used. This tool allows creating websites without writing any code. The only piece of JavaScript that had to be written makes an AJAX request that passes the form data to a custom API.

"CRM" via OneDrive and Excel

All the accounts were managed via Excel files, one file per partner company. This kind of approach has many benefits out of the box. Let's mention a few:

  • Intuitive and flexible data management via Excel
  • Access management and sharing capabilities provided by OneDrive
  • Online collaboration and change tracking built-in

Azure Logic Apps - the glue

The core business logic was developed as a custom service implemented in .NET Core and C#. This service also had its own database. Data edited in Excel files needed to be synced with the database in various ways:

  • changes made via Excel files needed to be reflected in the central database
  • when data was modified by the business logic (for example a status change or data generated as a result of the business flow), changes needed to be reflected back in Excel to keep a consistent view
  • when a new account was registered in the system, a new Excel file to manage it was automatically created in OneDrive

All of these use cases were implemented via Azure Logic Apps. A Logic App is composed of ready-to-use building blocks. Here's an example execution log of one of the Logic Apps:

In this case, any time an Excel file is modified in OneDrive, a request is made to the custom API to upload the file so the updates can be processed. Before the request, an access token is obtained. The processed file is saved for audit, and in case of an error an email alert is sent.
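On the API side, the endpoint called by the Logic App can be as simple as a file-upload action. A minimal, hypothetical sketch in ASP.NET Core; the controller name, route and processing logic are made up for illustration:

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/excel-sync")]
public class ExcelSyncController : ControllerBase
{
    // Called by the Logic App whenever an Excel file changes in OneDrive.
    [HttpPost("upload")]
    public async Task<IActionResult> Upload(IFormFile file)
    {
        if (file is null || file.Length == 0)
        {
            return BadRequest("Empty file");
        }

        await using var stream = file.OpenReadStream();
        // ... parse the workbook and sync the changes into the central database ...

        return Ok();
    }
}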

Under the hood, a Logic App is defined as a JSON file, so its definition can be stored in the code repository and deployed to Azure via ARM.

Power BI to provide data insights

Reporting was the ultimate goal of that project. The business needed to know about the performance of particular sales agents and internal employees for things like commission reporting and follow-up calls.

Compared to developing a custom reporting solution, Power BI makes it super easy to create a UI to browse, filter and export data. Once the connection with the database is established, a data model can be defined to create interesting visualisations with extensive filtering options. All these features are available for $9.99/month/user.

If you know SQL and relational data modelling but are new to Power BI, I can recommend this tutorial to get up to speed with Power BI:

https://www.youtube.com/watch?v=AGrl-H87pRU

Summary

Thanks to low-code and no-code tools like Azure Logic Apps, Power BI and Webflow, it was possible to deliver an end-to-end solution that users could interact with, without any custom UI code. If the project had also included custom UI and the related backend development, it would have taken several times longer to provide similar capabilities. We could imagine a simple UI built with less effort, but it would not come close to the rich capabilities provided by Power BI and Excel out of the box.

Happy low-coding! :)

.NET MAUI vs Xamarin.Forms

· 7 min read

I've been focusing recently on Xamarin and also following the updates on MAUI. MAUI started as a fork of Xamarin.Forms and this is how it should be seen: as the next version of Xamarin.Forms. There will be no version 6 of Forms (the current version is 5). Instead, Xamarin.Forms apps will have to be upgraded to MAUI (Multi-platform App UI). The alternative to upgrading is staying on Xamarin.Forms 5, which will be supported only for 12 months after the official MAUI release. So if we want to be on a supported tech stack, we need to start getting familiar with MAUI.

MAUI, and the whole Xamarin framework, will be part of .NET 6. It was initially planned to release MAUI in November 2021 together with the new .NET release. Now we know that the production-quality release has been postponed to Q2 2022. Until then, we will keep getting preview versions of MAUI. It also means that Xamarin.Forms will be supported longer (Q2 2022 + 12 months).

OK, but what changes can we expect with MAUI? Below is a summary of the key differences compared to Xamarin.Forms 5.

1. Single project experience

In Xamarin.Forms we need separate project(s) for each platform and also project(s) for shared code. In MAUI we have the option to work with a single project that targets multiple platforms:

In a single project we can still have platform-specific directories (under the "Platforms" directory) or even platform-specific code in a single file by using preprocessor directives:

This is not something that MAUI introduced; it is achieved thanks to SDK-style projects, which are available in .NET 5. Already in .NET 5 we can multi-target projects and instruct MSBuild which files or directories should be target-specific.

Example of multiple targets in an SDK-style csproj:

Example of conditional compilation based on target:
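For illustration, conditional compilation in a shared file could look roughly like this (assuming the project defines per-target symbols such as ANDROID and IOS):

// Shared file with platform-specific branches selected at compile time.
public static class PlatformInfo
{
    public static string GetPlatformName()
    {
#if ANDROID
        return "Android";
#elif IOS
        return "iOS";
#else
        return "Other platform";
#endif
    }
}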

So this is not MAUI magic; it is just about using .NET 5 capabilities.

2. Central assets management

A consequence of the consistent single-project experience is also the need to manage assets in a single project. MAUI accomplishes that for PNG and SVG images by doing compile-time image resizing. We can still have platform-specific resources if needed, for example for other formats.

But again, it is not a revolutionary change. MAUI just makes ResizetizerNT an integrated part of the framework.

So this MAUI feature is another low-hanging fruit. It was possible to achieve this with Forms too, but now you do not have to add additional libraries.

3. New decoupled architecture

The new architecture seems to be where most of the MAUI team's effort goes. This change is actually significant. It is a big refactoring of Forms driven by a new architecture initially called Slim Renderers.

Slim Renderers was a kind of temporary name, so let's not get used to it. The term we should remember and get familiar with is Handler. The role of Renderers from Xamarin.Forms is taken by Handlers in MAUI.

What's the main difference? We can summarise it with one word: decoupling. This is how it looks in Xamarin.Forms:

Renderers, which produce the native view tree, depend on Forms controls. In the diagram above you can see an example for the Entry control on iOS and Android, but the same idea applies to other controls (like Button, Label, Grid etc.) and other platforms (Windows, macOS).

MAUI introduces a new abstraction layer of control interfaces. Handlers depend only on the interfaces, not on the implementation of the UI controls:

This approach decouples the responsibility for platform-specific rendering, handled by handlers, from the implementation of the UI framework. MAUI is split into the Microsoft.Maui.Core namespace and the Microsoft.Maui.Controls namespace:

What is important to notice is that support for XAML and the bindings implementation was also decoupled from the platform-specific code. It makes the architecture much more open for creating alternative UI frameworks based on Maui.Core but using different paradigms. We can already see the experimental Comet framework using that approach and proposing the MVU pattern instead of MVVM:

There is also a rumour around the MAUI project that the Fabulous framework could follow that path, but interestingly the Fabulous team does not seem to share the enthusiasm ;) It will be interesting to see how the idea of supporting F# in MAUI evolves.

But it is important to note that MAUI does not have built-in MVU support. MAUI Controls are designed to support the MVVM pattern that we know from Xamarin.Forms. What is changing is the open architecture enabling alternative approaches, but the alternatives are not built into MAUI.

4. Mappers

OK, so there are no Renderers, there are Handlers. So how do we introduce custom rendering when needed? Since there are no custom renderers, can we still have custom rendering? Yes, but we need to get familiar with one more new MAUI term: Mappers.

Mapper is a public static dictionary-like data type exposed by Handlers:

It maps each of the properties defined on a control's interface to a platform-specific function that renders the given property.

If we need custom behaviour we can just map our own function from our application:
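For example, customizing how Entry is rendered on Android could look roughly like this. This is only a sketch: the exact member names changed between MAUI previews (NativeView was later renamed to PlatformView), and the mapping key is an arbitrary string:

using Microsoft.Maui.Handlers;

// Append an extra mapping step for every Entry; "MyEntryCustomization" is an arbitrary key.
EntryHandler.Mapper.AppendToMapping("MyEntryCustomization", (handler, view) =>
{
#if ANDROID
    // Platform-specific tweak, e.g. drop the default Android underline.
    handler.PlatformView.Background = null;
#endif
});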

See this repository created by Javier Suárez with examples and detailed explanations on how to migrate from Forms to MAUI: https://github.com/jsuarezruiz/xamarin-forms-to-net-maui

And do not worry about your existing custom renderers; they will still work thanks to the MAUI compatibility package, although it is recommended to migrate them to handlers to get the benefits of improved performance.

5. Performance improvements

The goal of the new architecture is not only to decouple layers, but also to improve performance. Handlers are more lightweight than renderers, as each property is handled by a separate mapping function instead of one big rendering method that updates the whole component.

MAUI also avoids assembly scanning at startup to find custom renderers. Custom handlers for your custom controls are registered explicitly at app startup:
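A rough sketch of such explicit registration on the app builder; MyCustomControl and MyCustomControlHandler are placeholder types:

using Microsoft.Maui.Controls.Hosting;
using Microsoft.Maui.Hosting;

public static class MauiProgram
{
    public static MauiApp CreateMauiApp()
    {
        var builder = MauiApp.CreateBuilder();

        builder
            .UseMauiApp<App>()
            .ConfigureMauiHandlers(handlers =>
            {
                // Explicit registration instead of startup assembly scanning.
                handlers.AddHandler(typeof(MyCustomControl), typeof(MyCustomControlHandler));
            });

        return builder.Build();
    }
}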

One more performance improvement should be reduced view nesting. Forms has the concept of fast renderers; in MAUI all handlers should be "fast" by design.

But the MAUI release was not postponed without reason. The team is still working on performance improvements. First benchmarks show that MAUI apps start even slower than Forms apps; see this issue for details: https://github.com/dotnet/maui/issues/822. In this case the observed difference is not dramatic (around 100 ms), but still, we should not take for granted that MAUI is already faster.

6. BlazorWebView

Do you like to use Blazor for web UIs? Great news: with MAUI you will be able to use Blazor components on all the platforms supported by MAUI (Android, iOS, Windows, macOS). Components will render locally into HTML. The HTML UI will run in a web view, but it avoids WebAssembly and SignalR, so we can expect relatively good performance.

And what is most important, Blazor components will be able to access native APIs from code-behind! I think this is a great feature that opens up scenarios for interesting hybrid architectures (combining the native and web stacks in a single app).

See this video for details: Introduction to .NET MAUI Blazor | The Xamarin Show

7. C# Hot Reload

And last but not least, since MAUI will be part of .NET 6, it will also get all the other benefits coming with .NET 6. One of them is hot reload for C# code. In Forms we have hot reload only for XAML, so this is a great productivity improvement, especially for UI development.

Summary

MAUI introduces significant changes, but the framework can still be considered an evolution of Forms. Fingers crossed that it reaches the stability of Forms 5 and makes the framework even better thanks to the above improvements.

A note about complexity

· 3 min read

I came across this very interesting reading about complexity: https://iveybusinessjournal.com/publication/coping-with-complexity/

The most useful advice I found there was the idea of improvisation. So far, in the context of professional work, I had rather negative connotations of improvisation. I saw it as a way of hiding a lack of preparation or knowledge, something that degrades the expected quality.

But I was wrong. Improvisation turns out to be a great tool that anyone can start using relatively easily to deal with complexity. The inspiration comes from theatre improvisation and music jam sessions, where the play is based on the "yes, and..." rule.

Basically, the rule says that whoever enters the show has to build their part on what others have already said or played, not trying to negate it or start something irrelevant. Participants are expected to continue and extend the plot that has already been created.

I find this rule very useful when working on complex projects. I can recall many situations when there seemed to be so many options and so much uncertainty that it felt impossible to progress in the right direction. Those situations can be unblocked by improvisation, where we are allowed to progress based on limited knowledge. And in a VUCA world we all have a limited understanding of any non-trivial subject. The key is to identify the minimum set of facts required to progress and to build, with your own expertise, on top of what was already created. The facts from which to start are identified by the skill of listening to others, not focusing solely on your own part.

The rule of not negating others' work is the key factor here. You are allowed to suggest turns to the left or right, but it should still be progress on the same journey. We should not start from a completely new point on our own, as it creates even more VUCA.

By using this method we can progress even when we are not sure where to go (like in machine learning). We can use our joint force to explore and move faster. As we move on, we will make mistakes but also create chances for victories. Moving on is the key. Staying in place, paralyzed by a hard decision, is something that may kill a project. And negating or ignoring what was already said and done does not create progress.

In a VUCA world, being certain that we are on the optimal path is impossible. What is possible is exploration. If we are focused and take every small step based on competent knowledge, then we can expect partial results to be achieved on a daily basis, and eventually the bigger goals are very likely to be met as well, probably in a way that was not initially expected.

What is software architecture about?

· 2 min read

Martin Fowler has assembled great material explaining what software architecture is: https://martinfowler.com/architecture/

My key takeaway from this reading is that software architecture is about making sure that adding functionality to software does not become more and more expensive as time goes by.

In other words, the development effort done today should also support future innovation. In business we often hear the motto to focus on strengths and build on top of them. One of an organization's strengths may be its software. So, to follow the motto in software teams, it should be much easier to build new products and services on top of the existing codebase and infrastructure than to start from scratch. The existing codebase should be a competitive advantage, not a burden.

But is it always like that? Organizations sometimes conclude that it makes more sense to start a product from scratch or rewrite existing software to support new functionality. Does it mean that the old systems had the wrong software architecture?

Yes and no. The old architecture could have been great and efficient at supporting the business goals of the past, but may not be suitable in the context of a new reality and new business goals. Sometimes a brave decision must be made to switch to a new architecture in order to stay on top. Otherwise new competitors, who do not have to carry the burden of historical decisions, may grow much faster in the new reality and eventually take over the market. The proper timing of such technological shifts may be crucial for organizations to perform well.

A changing environment does not have to mean a change of business model or market; it may also mean the availability of new technologies or organizational changes.

Nevertheless, this kind of radical architecture change should happen as seldom as possible, as there is always a huge cost and complexity behind such shifts. Architecture should aim to predict likely scenarios and stay open to change. There are always at least a few options supported by the current constraints, and options more open to change and extensibility should always be preferred if the cost is comparable.

Azure Service Bus with MassTransit

· 3 min read

The MassTransit project is a well-known abstraction layer for .NET over the most popular message brokers, covering providers like RabbitMQ and Azure Service Bus. It will not only configure the underlying message broker via a friendly API, but will also address issues like error handling and concurrency. It also provides a clean implementation of the saga pattern and a couple of other patterns useful in distributed systems.

This article is a brief introduction to MassTransit for Azure Service Bus users.

Sending a message

When sending a message, we must specify a so-called "send endpoint" name. Under the hood, the send endpoint is an Azure Service Bus queue name. When the Send method is called, the queue is created automatically.

ISendEndpointProvider needs to be injected to call the Send method. See the producers docs for other ways to send a message.

The queue name can be specified by creating a mapping from message type to queue name: EndpointConvention.Map<IMyCommand>(new Uri("queue:my-command"));
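Putting it together, sending could look roughly like this (a sketch; IMyCommand stands for your own message contract):

using MassTransit;

// The application's own message contract (an interface, as recommended by MassTransit).
public interface IMyCommand
{
    string MyProperty { get; }
}

public class MyCommandSender
{
    private readonly ISendEndpointProvider _sendEndpointProvider;

    public MyCommandSender(ISendEndpointProvider sendEndpointProvider)
        => _sendEndpointProvider = sendEndpointProvider;

    public async Task SendMyCommand()
    {
        // Resolve the send endpoint (the Azure Service Bus queue) and send the message.
        var endpoint = await _sendEndpointProvider.GetSendEndpoint(new Uri("queue:my-command"));

        await endpoint.Send<IMyCommand>(new
        {
            MyProperty = "some value"
        });
    }
}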

Publishing an event

When publishing an event, we do not have to specify any endpoint name to send the message to. MassTransit by convention creates a topic corresponding to the published message's full type name (including namespace). So under the hood we have a concrete topic, just as we have a concrete queue when sending a message, but in the case of events we do not specify that topic explicitly. I find it a bit inconsistent, but I understand the idea: conceptually, at the abstraction level, publishing an event does not have a target receiver. It is up to subscribers to subscribe to it. The publisher just throws the event into the air.

To publish an event we simply call the Publish method on an injected IPublishEndpoint:

await _publishEndpoint.Publish<IMyEventDone>(new {
    MyProperty = "some value"
});

Event subscribers

It is important to understand how topics and subscriptions work in Azure Service Bus. Each subscription is effectively a queue. When a topic does not have any subscriptions, events published to it are simply lost. This is by-design behaviour in the pub/sub pattern.

Consumers connect to subscriptions to process the messages. If there are no active consumers, messages will be left in the subscription queue until a consumer appears or until a timeout. A subscription is a persistent entity, but consumers are dynamic processes. There may be multiple competing consumers for a given subscription to scale out message processing. Usually different subscriptions are created by different services/sub-systems interested in given events.

cfg.SubscriptionEndpoint<IMyEventDone>("my-subscription", c => {
    c.ConfigureConsumer<Consumer>(context);
});
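The Consumer type configured above is a regular MassTransit consumer; a minimal sketch could look like this (IMyEventDone mirrors the contract used in the publish example):

using MassTransit;

public interface IMyEventDone
{
    string MyProperty { get; }
}

// Minimal sketch of the consumer referenced in the subscription endpoint configuration above.
public class Consumer : IConsumer<IMyEventDone>
{
    public Task Consume(ConsumeContext<IMyEventDone> context)
    {
        // Handle the event; MassTransit takes care of message completion and retries.
        Console.WriteLine($"Received: {context.Message.MyProperty}");
        return Task.CompletedTask;
    }
}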

It is worth mentioning that if we use MassTransit and connect to a subscription endpoint but do not register any consumers, messages sent to this endpoint will be automatically moved to a _skipped queue created for the IMyEventDone type.

There is also an alternative way of creating subscriptions, which uses an additional queue to which messages from the subscription are auto-forwarded; see the docs for details.

Anonymous types for messages

The MassTransit author recommends using interfaces for message contracts. MassTransit comes with a useful Roslyn analyzers package which simplifies using anonymous types as interface implementations. After installing the analyzers (Install-Package MassTransit.Analyzers) we can automatically add missing properties with Alt+Enter:

Blazor Web Assembly large app size

· 2 min read

There is a gotcha when creating a Blazor application from the template that includes a service worker (PWA support enabled). Notice that the service worker pre-fetches all the static assets (files like JavaScript, CSS and images). See this code on GitHub as an example of the code generated by the default dotnet template for WebAssembly with PWA.

If you are using a bootstrap template like https://adminlte.io/, then the web assets folder will include all the plugins it comes with. You may use just 1% of its functionality, but the default service worker will pre-fetch all the unused assets.

That can easily make the initial size of the app loaded in the browser around 40 MB (in release mode). Do not assume that all those files are necessary .NET framework libraries loaded as WebAssembly. When looking into the network tab you'll notice that most of the resources are JS/CSS files that you have probably never used in your app.

A normal WebAssembly application created with Blazor, even on top of a bootstrap template or a component library like https://blazorise.com/ (or both), should be less than 2 MB of network resources (excluding any images you may have).

So, please watch out for the default PWA service worker. It can make the initial app loading time unnecessarily long. If you are not providing any offline functionality, the easiest solution is to remove the service worker from your project by removing its registration from the index.html file. Another option is to adjust the asset include pattern so that only what is really used gets pre-fetched.