1
1
submitted 1 year ago by [email protected] to c/[email protected]

Today we are announcing the rename of Amazon Kinesis Data Analytics to Amazon Managed Service for Apache Flink, a fully managed and serverless service for you to build and run real-time streaming applications using Apache Flink.

We continue to deliver the same experience in your Flink applications without any impact on ongoing operations, developments, or business use cases. All your existing running applications in Kinesis Data Analytics will work as is without any changes.

Many customers use Apache Flink for data processing, including support for diverse use cases with a vibrant open-source community. While Apache Flink applications are robust and popular, they can be difficult to manage because they require scaling and coordination of parallel compute or container resources. With the explosion of data volumes, data types, and data sources, customers need an easier way to access, process, secure, and analyze their data to gain faster and deeper insights without compromising on performance and costs.

Using Amazon Managed Service for Apache Flink, you can set up and integrate data sources or destinations with minimal code, process data continuously with sub-second latencies from hundreds of data sources like Amazon Kinesis Data Streams and Amazon Managed Streaming for Apache Kafka (Amazon MSK), and respond to events in real-time. You can also analyze streaming data interactively with notebooks in just a few clicks with Amazon Managed Service for Apache Flink Studio with built-in visualizations powered by Apache Zeppelin.

With Amazon Managed Service for Apache Flink, you can deploy secure, compliant, and highly available applications. There are no servers and clusters to manage, no compute and storage infrastructure to set up, and you only pay for the resources your applications consume.

A History to Support Apache Flink

Since we launched Amazon Kinesis Data Analytics based on a proprietary SQL engine in 2016, we learned that SQL alone was not sufficient to provide the capabilities that customers needed for efficient stateful stream processing. So, we started investing in Apache Flink, a popular open-source framework and engine for processing real-time data streams.

In 2018, we provided support for Amazon Kinesis Data Analytics for Java as a programmable option for customers to build streaming applications using Apache Flink libraries and choose their own integrated development environment (IDE) to build their applications. In 2020, we repositioned Amazon Kinesis Data Analytics for Java to Amazon Kinesis Data Analytics for Apache Flink to emphasize our continued support for Apache Flink. In 2021, we launched Kinesis Data Analytics Studio (now, Amazon Managed Service for Apache Flink Studio) with a simple, familiar notebook interface for rapid development powered by Apache Zeppelin and using Apache Flink as the processing engine.

Since 2019, we have worked more closely with the Apache Flink community, increasing code contributions in the area of AWS connectors for Apache Flink such as those for Kinesis Data Streams and Kinesis Data Firehose, as well as sponsoring annual Flink Forward events. Recently, we contributed Async Sink to the Flink 1.15 release, which improved cloud interoperability and added more sink connectors and formats, among other updates.

Beyond connectors, we continue to work with the Flink community to contribute availability improvements and deployment options. To learn more, see Making it Easier to Build Connectors with Apache Flink: Introducing the Async Sink in the AWS Open Source Blog.

New Features in Amazon Managed Service for Apache Flink

As I mentioned, you can continue to run your existing Flink applications in Kinesis Data Analytics (now Amazon Managed Service for Apache Flink) without making any changes. I also want to highlight the updated console and a new feature: blueprints, which let you create an end-to-end data pipeline with just one click.

First, you can use the new console of Amazon Managed Service for Apache Flink directly under the Analytics section in AWS. To get started, you can easily create Streaming applications or Studio notebooks in the new console, with the same experience as before.

To create a streaming application in the new console, choose Create from scratch or Use a blueprint. With a new blueprint option, you can create and set up all the resources that you need to get started in a single step using AWS CloudFormation.

The blueprint is a curated collection of Apache Flink applications. The first of these has demo data being read from a Kinesis Data Stream and written to an Amazon Simple Storage Service (Amazon S3) bucket.

After creating the demo application, you can configure, run, and open the Apache Flink dashboard to monitor your Flink application's health, with the same experience as before. You can change the code sample in the GitHub repository to perform different operations using the Flink libraries in your own local development environment.
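
If you want a feel for what such an application looks like in code, here is a minimal, hypothetical sketch of a Flink job that reads a Kinesis data stream and prints its records. The stream name and region are made up, the flink-connector-kinesis dependency and AWS credentials are assumed to be available, and this is not the blueprint's actual code:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

public class KinesisToStdout {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Connector configuration: the region is a hypothetical value.
        Properties consumerConfig = new Properties();
        consumerConfig.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");

        // Read the demo stream as plain strings and print each record.
        DataStream<String> records = env.addSource(
            new FlinkKinesisConsumer<>("demo-input-stream", new SimpleStringSchema(), consumerConfig));
        records.print();

        env.execute("kinesis-demo");
    }
}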

Blueprints are designed to be extensible, and you can leverage them to create more complex applications to solve your business challenges based on Amazon Managed Service for Apache Flink. Learn more about how to use Apache Flink libraries in the AWS documentation.

You can also use a blueprint to create your Studio notebook using Apache Zeppelin as a new setup option. With this new blueprint option, you can also create and set up all the resources that you need to get started in a single step using AWS CloudFormation.

This blueprint includes Apache Flink applications with demo data being sent to an Amazon MSK topic and read in Managed Service for Apache Flink. With an Apache Zeppelin notebook, you can view, query, and analyze your streaming data. Deploying the blueprint and setting up the Studio notebook takes about ten minutes. Go get a cup of coffee while we set it up!

After creating the new Studio notebook, you can open an Apache Zeppelin notebook to run SQL queries in your note, with the same experience as before. You can view a code sample in the GitHub repository to learn more about how to use Apache Flink libraries.

You can run more SQL queries on this demo data such as user-defined functions, tumbling and hopping windows, Top-N queries, and delivering data to an S3 bucket for streaming.

You can also use Java, Python, or Scala to power up your SQL queries and deploy your note as a continuously running application, as shown in the blog posts on how to use the Studio notebook and how to query your Amazon MSK topics.

To learn more about blueprint samples, see the GitHub repositories, such as those for reading from MSK Serverless and writing to Amazon S3, and reading from MSK Serverless and writing to MSK Serverless.

Now Available

You can now use Amazon Managed Service for Apache Flink, renamed from Amazon Kinesis Data Analytics. All your existing running applications in Kinesis Data Analytics will work as is without any changes.

To learn more, visit the new product page and developer guide. You can send feedback to AWS re:Post for Amazon Managed Service for Apache Flink, or through your usual AWS Support contacts.

— Channy

2
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

In this tutorial, we'll look at different ways to read JSON documents as Maps and compare them. We'll also look at ways to find the differences between the two Maps.

2. Converting to Map

First, we'll look at different ways to convert JSON documents to Maps. Let's look at the JSON objects we'll use for our test. Let's create a file named first.json with the following content:

{
  "name": "John",
  "age": 30,
  "cars": [ "Ford", "BMW" ],
  "address": { "street": "Second Street", "city": "New York" },
  "children": [
    { "name": "Sara", "age": 5 },
    { "name": "Alex", "age": 3 }
  ]
}

Similarly, let's create another file named second.json with the following content:

{
  "name": "John",
  "age": 30,
  "cars": [ "Ford", "Audi" ],
  "address": { "street": "Main Street", "city": "New York" },
  "children": [
    { "name": "Peter", "age": 5 },
    { "name": "Cathy", "age": 10 }
  ]
}

As we can see, there are a few differences between the above JSON documents: the value of the cars array is different, the value of the street key in the address object is different, and the children arrays have multiple differences.

2.1. Using Jackson

Jackson is a popular library used for JSON operations. We can use Jackson to convert a JSON to a Map. Let’s start by adding the Jackson dependency:

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.15.2</version>
</dependency>

Now we can convert a JSON document to a Map using Jackson:

class JsonUtils {
    public static Map<String, Object> jsonFileToMap(String path) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        return mapper.readValue(new File(path), new TypeReference<Map<String, Object>>() {});
    }
}

Here, we're using the readValue() method from the ObjectMapper class to convert the JSON document to a Map. It takes the JSON document as a File object and a TypeReference object as parameters.

2.2. Using Gson

Similarly, we can also use Gson to convert the JSON document to a Map. We need to include the dependency for this:

<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.10.1</version>
</dependency>

Now let's look at the code to convert the JSON:

public static Map<String, Object> jsonFileToMapGson(String path) throws IOException {
    Gson gson = new Gson();
    return gson.fromJson(new FileReader(path), new TypeToken<Map<String, Object>>() {}.getType());
}

Here, we're using the fromJson() method from the Gson class to convert the JSON document to a Map. It takes the JSON document as a FileReader object and a TypeToken object as parameters.

3. Comparing Maps

Now that we've converted the JSON documents to Maps, let's look at different ways to compare them.

3.1. Using Guava's Maps.difference()

Guava provides a Maps.difference() method that can be used to compare two Maps. To utilize this, let’s add the Guava dependency to our project:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>32.1.2-jre</version>
</dependency>

Now, let's look at the code to compare the Maps:

@Test
void givenTwoJsonFiles_whenCompared_thenTheyAreDifferent() throws IOException {
    Map<String, Object> firstMap = JsonUtils.jsonFileToMap("src/test/resources/first.json");
    Map<String, Object> secondMap = JsonUtils.jsonFileToMap("src/test/resources/second.json");

    MapDifference<String, Object> difference = Maps.difference(firstMap, secondMap);
    difference.entriesDiffering().forEach((key, value) ->
        System.out.println(key + ": " + value.leftValue() + " - " + value.rightValue()));
    assertThat(difference.areEqual()).isFalse();
}

Here, we're using the entriesDiffering() method to get the differences between the Maps. This returns a Map of differences where the key is the path to the value, and the value is a MapDifference.ValueDifference object containing the values from both Maps. If we run the test, we'll see the keys that differ between the Maps, along with their values:

cars: [Ford, BMW] - [Ford, Audi]
address: {street=Second Street, city=New York} - {street=Main Street, city=New York}
children: [{name=Sara, age=5}, {name=Alex, age=3}] - [{name=Peter, age=5}, {name=Cathy, age=10}]

As we can see, this shows that the cars, address, and children fields are different, and the differences are listed. However, Guava can only compare one level of Maps, which doesn't work for our nested Maps: it doesn't show which nested fields cause these differences. For example, it doesn't point out that the street field in the address objects is different. We'll address this by flattening the Maps next.

3.2. Flattening Maps

To precisely point out differences between nested Maps, we'll flatten the Maps so each key is a path to the value. For example, the street key in the address object will be flattened to address.street, and so on. Let's look at the code for this:

class FlattenUtils {
    public static Map<String, Object> flatten(Map<String, Object> map) {
        return flatten(map, null);
    }

    private static Map<String, Object> flatten(Map<String, Object> map, String prefix) {
        Map<String, Object> flatMap = new HashMap<>();
        map.forEach((key, value) -> {
            String newKey = prefix != null ? prefix + "." + key : key;
            if (value instanceof Map) {
                flatMap.putAll(flatten((Map<String, Object>) value, newKey));
            } else if (value instanceof List) {
                // check for list of primitives
                Object element = ((List) value).get(0);
                if (element instanceof String || element instanceof Number || element instanceof Boolean) {
                    flatMap.put(newKey, value);
                } else {
                    // check for list of objects
                    List<Map<String, Object>> list = (List<Map<String, Object>>) value;
                    for (int i = 0; i < list.size(); i++) {
                        flatMap.putAll(flatten(list.get(i), newKey + "[" + i + "]"));
                    }
                }
            } else {
                flatMap.put(newKey, value);
            }
        });
        return flatMap;
    }
}

Here, we're using recursion to flatten the Map. For any field, one of the following conditions will be true:
The value could be a Map (a nested JSON object). In this case, we recursively call the flatten() method with the value as the parameter. For example, the address object will be flattened to address.street and address.city.
The value could be a List (a JSON array). If the list contains primitive values, we add the key and value to the flattened Map. If the list contains objects, we recursively call the flatten() method with each object as the parameter. For example, the children array will be flattened to children[0].name, children[0].age, children[1].name, and children[1].age.
If the value is neither a Map nor a List, we add the key and value to the flattened Map.
The recursion continues until we reach the last level of the Map. At that point, we'll have a flattened Map with each key as a path to the value.

3.3. Testing

Now that we've flattened the Maps, let's look at how we can compare them using Maps.difference():

@Test
void givenTwoJsonFiles_whenCompared_thenTheyAreDifferent() throws IOException {
    Map<String, Object> firstFlatMap = FlattenUtils.flatten(JsonUtils.jsonFileToMap("src/test/resources/first.json"));
    Map<String, Object> secondFlatMap = FlattenUtils.flatten(JsonUtils.jsonFileToMap("src/test/resources/second.json"));

    MapDifference<String, Object> difference = Maps.difference(firstFlatMap, secondFlatMap);
    difference.entriesDiffering().forEach((key, value) ->
        System.out.println(key + ": " + value.leftValue() + " - " + value.rightValue()));
    assertThat(difference.areEqual()).isFalse();
}

Again, we'll print the keys and values that are different. This leads to the output below:

cars: [Ford, BMW] - [Ford, Audi]
children[1].age: 3 - 10
children[1].name: Alex - Cathy
address.street: Second Street - Main Street
children[0].name: Sara - Peter

4. Conclusion

In this article, we looked at comparing two JSON documents in Java. We looked at different ways to convert the JSON documents to Maps and then compared them using Guava’s Maps.difference() method. We also looked at how we can flatten the Maps so that we can compare nested Maps. As always, the code for this article is available over on GitHub.

3
1
submitted 1 year ago by [email protected] to c/[email protected]

At Google Cloud Next 2023, thousands of people gathered in San Francisco to learn about the newest updates from Google Cloud.

Stylized version of the word Next in bright colors in front of a black background

4
1
submitted 1 year ago by [email protected] to c/[email protected]

Virtual Threads are one of the most anticipated and exciting new features of JDK 21. They are a new model of threads, much lighter than traditional platform threads. Virtual Threads address the complexity and maintenance costs of asynchronous programming without giving up the performance that model provides. With virtual threads, you can get excellent throughput with simple, imperative, blocking code. This episode tells you all about it: it shows you how you can use them and how Virtual Threads work under the hood. Make sure to check the show resources.
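
As a quick taste of the programming model (a minimal sketch on JDK 21, not code from the episode), the following starts thousands of virtual threads that each run plain blocking code, then creates a single virtual thread directly:

import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class VirtualThreadsDemo {

    public static void main(String[] args) throws InterruptedException {
        // One virtual thread per task: simple blocking code, cheap enough to start thousands of them.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 10_000).forEach(i ->
                executor.submit(() -> {
                    // Simulates a blocking call (e.g. I/O); the carrier thread is freed while we sleep.
                    Thread.sleep(Duration.ofMillis(100));
                    return i;
                }));
        } // close() waits for all submitted tasks to finish

        // A single virtual thread can also be started directly.
        Thread vt = Thread.ofVirtual().name("demo-virtual-thread").start(() ->
            System.out.println("running on " + Thread.currentThread()));
        vt.join();
    }
}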

5
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

While using an if statement, we might need multiple conditions in it with logical operators such as AND or OR. This may not be a clean design and affects the code's readability and cognitive complexity. In this tutorial, we'll look at alternative ways to format multiple value conditions in an if statement.

2. Can We Avoid If Statements?

Suppose we have an e-commerce platform and set up a discount for people born in specific months. Let's have a look at a code snippet:

if (month == 10 || month == 11) {
    // doSomething()
} else if (month == 4 || month == 5) {
    // doSomething2()
} else {
    // doSomething3()
}

This might lead to some code that is difficult to read. Also, even if we have good test coverage, the code may be hard to maintain, as it forces us, for example, to group together the different conditions that execute a specific action.

2.1. Use Clean Code

We can apply patterns to replace many if statements. For example, we can move the logic of an if's multiple conditions into a class or an Enum. At runtime, we'll switch between interfaces based on the client input. Similarly, we can have a look at the Strategy pattern. This does not strictly relate to formatting and usually leads to rethinking the logic. Nonetheless, it is something we can consider to improve our design.

2.2. Improve Methods Syntax

However, there is nothing wrong with using the if / else logic as long as the code is readable and easy to maintain. For example, let's consider this code snippet:

if (month == 8 || month == 9) {
    return doSomething();
} else {
    return doSomethingElse();
}

As a first step, we can avoid using the else part:

if (month == 8 || month == 9) {
    return doSomething();
}
return doSomethingElse();

Also, some other code can be improved, for example, by replacing month numbers with the Month enum from the java.time package:

if (month == OCTOBER || month == NOVEMBER || month == DECEMBER) {
    return doSomething();
}
// ...

These are simple yet effective code improvements. So, before applying complex patterns, we first should see if we can ease the code readability. We'll also see how to use functional programming. In Java, this applies from version 8 with the lambda expression syntax.

3. Tests Legend

Following the e-commerce discount example, we'll create tests to check values in the discount months, for instance, from October to December. Otherwise, we'll assert false. We'll set random months that are either in or out of the allowed range:

Month monthIn() {
    return Month.of(rand.ints(10, 13)
      .findFirst()
      .orElse(10));
}

Month monthNotIn() {
    return Month.of(rand.ints(1, 10)
      .findFirst()
      .orElse(1));
}

There could be multiple if conditions, although, for simplicity, we'll assume just one if / else statement.

4. Use Switch

An alternative to using an if logic is the switch command. Let's see how we can use it in our example:

boolean switchMonth(Month month) {
    switch (month) {
        case OCTOBER:
        case NOVEMBER:
        case DECEMBER:
            return true;
        default:
            return false;
    }
}

Notice that it falls through, checking all the valid months if required. Furthermore, we can improve this with the new switch syntax from Java 12:

return switch (month) {
    case OCTOBER, NOVEMBER, DECEMBER -> true;
    default -> false;
};

Finally, we can do some testing to validate values in or not in the range:

assertTrue(switchMonth(monthIn()));
assertFalse(switchMonth(monthNotIn()));

5. Use Collections

We can use a collection to group what satisfies the if condition and check if a value belongs to it:

Set<Month> months = Set.of(OCTOBER, NOVEMBER, DECEMBER);

Let's add some logic to see if the set contains a specific value:

boolean contains(Month month) {
    if (months.contains(month)) {
        return true;
    }
    return false;
}

Likewise, we can add some unit tests:

assertTrue(contains(monthIn()));
assertFalse(contains(monthNotIn()));

6. Use Functional Programming

We can use functional programming to convert the if / else logic to a function. Following this approach, we will have a predictable usage of our method syntax.

6.1. Simple Predicate

Let's still use the contains() method. However, this time we'll make it a lambda expression using a Predicate:

Predicate<Month> collectionPredicate = this::contains;

We are now sure that the Predicate is immutable with no intermediate variables. Its outcome is predictable and reusable in other contexts if we need to. Let's check it out using the test() method:

assertTrue(collectionPredicate.test(monthIn()));
assertFalse(collectionPredicate.test(monthNotIn()));

6.2. Predicate Chain

We can chain multiple Predicates, adding our logic in an or condition:

Predicate<Month> orPredicate() {
    Predicate<Month> predicate = x -> x == OCTOBER;
    Predicate<Month> predicate1 = x -> x == NOVEMBER;
    Predicate<Month> predicate2 = x -> x == DECEMBER;
    return predicate.or(predicate1).or(predicate2);
}

We can then plug it into the if:

boolean predicateWithIf(Month month) {
    if (orPredicate().test(month)) {
        return true;
    }
    return false;
}

Let's check that this works with a test:

assertTrue(predicateWithIf(monthIn()));
assertFalse(predicateWithIf(monthNotIn()));

6.3. Predicate in Streams

Similarly, we can use a Predicate in a Stream filter. Likewise, a lambda expression in a filter will replace and enhance the if logic; the if will eventually disappear. This is an advantage of functional programming while still keeping good performance and readability. Let's test this while parsing an input list of months:

List<Month> monthList = List.of(monthIn(), monthIn(), monthNotIn());
monthList.stream()
  .filter(this::contains)
  .forEach(m -> assertThat(m, is(in(months))));

We could also use predicateWithIf() instead of contains(). A lambda has no restrictions if it supports the method signature. For instance, it could be a static method.

7. Conclusion

In this tutorial, we learned how to improve the readability of multiple conditions in an if statement. We saw how to use a switch instead. Furthermore, we've also seen how to use a collection to check whether it contains a value. Finally, we saw how to adopt a functional approach using lambda expression. Predicate and Stream are less error-prone and will enhance code readability and maintenance. As always, the code presented in this article is available over on GitHub.

6
1
Amazon SNS Vs. Amazon SQS (feeds.feedblitz.com)
submitted 1 year ago by [email protected] to c/[email protected]
  1. Introduction

In this tutorial, we'll talk about two of the top services that AWS provides to users: SNS and SQS. First of all, we'll give a short description of each, peeking at some simple use cases. Then we'll point out the main differences between them, looking from different angles. Finally, we'll see how powerful these services are when coupled together.

2. SNS Definition and Use Cases

Users utilize Amazon Simple Notification Service as a managed service for sending real-time notifications. To understand SNS easily, we can focus specifically on three objects: the topic, the publisher, and the subscriber. One topic can receive messages from multiple publishers and can deliver the same message to multiple subscribers. Every message a publisher sends to the topic will reach all subscribers registered:

Let's talk about the journey of a message from a publisher to a subscriber. First of all, both publishers and subscribers need permission to read from and write to the SNS topic. We can define permissions with IAM Access Policies. Then, when the message reaches the topic, it's stored in encrypted-at-rest storage that is used to retry delivery of the message if something fails. Looking at the image, we can observe that there are two types of topics: Standard and FIFO. The key differentiation is that only the latter ensures that the sequence of message delivery matches the order of publication. Before messages are dispatched, Data Protection Policies are applied. These policies provide an elevated level of safeguarding for Personally Identifiable Information (PII) and other forms of sensitive data; techniques such as data masking are implemented as part of this protective process. Finally, subscribers can receive the message using various protocols. It's also possible to define specific filtering policies for every subscriber, so that some messages are discarded and not sent, and a Dead Letter Queue (an SQS queue) to handle delivery failures and manual retries for a specific subscriber.

Users can deliver time-critical notifications using Amazon SNS. For example, with a monitoring tool such as Datadog, we can use SNS to send system alerts triggered by predefined thresholds in case something in our application is misbehaving. Another example is using SNS as a messaging system to send updates to every user subscribed to our application, via email, SMS, and mobile push notifications. Together, SNS and SQS are the fundamental building blocks of any AWS cloud-based application that implements the "fanout" scenario.

3. SQS Definition and Use Cases

Amazon launched AWS SQS (Simple Queue Service) back in 2004, and it was one of the first managed services available to users. It has become one of the fundamental building blocks of many cloud-based applications. It finds its primary use in enabling asynchronous communication between different software components. We know from experience that managing a queue presents a series of challenges, such as the bounded-buffer problem, and that managing a distributed queue is even harder because of the communication needed between components to manage concurrent writes and reads. SQS solves all of those problems in an easy way.
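
To make that decoupling concrete, here is a minimal, hypothetical sketch using the AWS SDK for Java v2: one side sends a message to a queue, the other receives it with long polling. The queue name and message body are made up, and credentials and region are assumed to be configured in the environment.

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.CreateQueueRequest;
import software.amazon.awssdk.services.sqs.model.DeleteMessageRequest;
import software.amazon.awssdk.services.sqs.model.Message;
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class SqsDemo {

    public static void main(String[] args) {
        try (SqsClient sqs = SqsClient.create()) {
            // Producer side: create (or look up) the queue and send a message.
            String queueUrl = sqs.createQueue(CreateQueueRequest.builder()
                .queueName("demo-queue") // hypothetical queue name
                .build()).queueUrl();
            sqs.sendMessage(SendMessageRequest.builder()
                .queueUrl(queueUrl)
                .messageBody("order-created:42")
                .build());

            // Consumer side: long polling waits up to 20 seconds for messages to arrive.
            for (Message message : sqs.receiveMessage(ReceiveMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .waitTimeSeconds(20)
                    .maxNumberOfMessages(10)
                    .build()).messages()) {
                System.out.println("received: " + message.body());
                // Delete the message once processed, otherwise it becomes visible again later.
                sqs.deleteMessage(DeleteMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .receiptHandle(message.receiptHandle())
                    .build());
            }
        }
    }
}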

Let's now talk about a message's journey from a publisher to a subscriber. The first part is similar to SNS: publishers and subscribers need permission to read from and write to the SQS queue. As with SNS, we can define those permissions with IAM Access Policies. Then, when the message reaches the queue, the system stores it in encrypted-at-rest storage. A delivery retry mechanism will read from this storage if something fails. The queue, in this case, can be Standard, FIFO, or Delay. We can pick a FIFO queue to maintain ordering or a delay queue to postpone delivery of the message by a predefined amount of time. The consumer can read from the queue using two different mechanisms: short polling and long polling. Any use case in which we need to decouple communication between software components can make good use of the SQS service. For example, in the SNS infrastructure, SQS serves as the underlying implementation of the Dead Letter Queue, handling delivery failures and manual retries for a specific subscriber.

4. SNS Vs. SQS: Differences

Let's now summarize the key differences between SNS and SQS:

Comparison            | SNS                                                | SQS
Entity Type           | Topic (standard and FIFO)                          | Queue (standard, FIFO, and Delayed)
Message consumption   | Push mechanism: SNS pushes messages to subscribers | Pull mechanism (long and short polling)
Delivery Guarantee    | At least once delivery                             | Exactly once delivery
Number of Subscribers | Best suited for multiple-subscriber use cases      | Best suited for single-subscriber use cases
Communication Type    | Real-time, A2A and A2P                             | Delayed communication, only A2A

5. Why Couple SNS and SQS Together?

The "fanout" scenario is the typical use case in which we need both SNS and SQS working together. In this case, messages are sent to an SNS topic and then replicated to the different SQS queues that are subscribed to it. This, of course, allows for parallel asynchronous processing. For example, let's suppose we need to build a video streaming platform. When a user uploads a new video, we publish an SNS message to a topic with the link to the item (stored in an S3 bucket). The topic is connected to several SQS queues, which will (concurrently) handle different encodings and video qualities for the same video. Then, a series of independent applications read from those queues and process the workload asynchronously:
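
To sketch how that wiring might look with the AWS SDK for Java v2 (a hypothetical example: the topic and queue ARNs are made up, the resources are assumed to already exist, and the queue access policies must already allow delivery from the topic):

import software.amazon.awssdk.services.sns.SnsClient;
import software.amazon.awssdk.services.sns.model.PublishRequest;
import software.amazon.awssdk.services.sns.model.SubscribeRequest;

public class FanoutDemo {

    public static void main(String[] args) {
        String topicArn = "arn:aws:sns:us-east-1:123456789012:video-uploaded";
        String encodingQueueArn = "arn:aws:sqs:us-east-1:123456789012:encode-1080p";
        String thumbnailQueueArn = "arn:aws:sqs:us-east-1:123456789012:generate-thumbnails";

        try (SnsClient sns = SnsClient.create()) {
            // Fan out: every queue subscribed to the topic receives a copy of each message.
            for (String queueArn : new String[] { encodingQueueArn, thumbnailQueueArn }) {
                sns.subscribe(SubscribeRequest.builder()
                    .topicArn(topicArn)
                    .protocol("sqs")
                    .endpoint(queueArn)
                    .build());
            }

            // Publishing once delivers the S3 link to both processing queues.
            sns.publish(PublishRequest.builder()
                .topicArn(topicArn)
                .message("s3://uploads/video-42.mp4")
                .build());
        }
    }
}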

The Dead-Letter Queue used to retry SNS message delivery is another example of using SNS and SQS together. In this way, clients and applications achieve real-time communication with better resiliency and fault tolerance.

6. Conclusion

In this article, we have described two of the most used software solutions of the AWS Cloud Provider: the Simple Queue Service and the Simple Notification Service. We discussed the key characteristics, we peeked at the basic functionalities, and we compared the most important features. Finally, we shared a common example of usage in which SNS and SQS combined solve some typical software design problems.

7
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

Maven is a build automation and project management tool mainly used for Java-based applications. It automates the software development lifecycle, making it easier for developers to manage dependencies, compile code, run tests, and package applications effectively. However, we may encounter an "Error in Opening Zip File" issue when using Maven commands to build an application. This issue often arises due to corrupted or inaccessible JAR files in the local Maven repository. In this tutorial, we'll explore various ways to resolve this issue.

2. Removing the Corrupted JAR File

Maven follows the "convention over configuration" principle to ensure a predefined project structure and avoid build and configuration errors. But sometimes we face an "Error in Opening Zip File" issue due to a corrupted JAR file.

2.1. Identify the Corrupted JAR

To resolve this issue, we must first identify the corrupted JAR file. To locate it, we need to check the build logs, which provide the process details and the name of the offending file.
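
If we want to double-check that the file named in the logs is really corrupted, one quick sanity check is to try opening it as a ZIP archive ourselves, since Maven's error is ultimately a ZipException. This is only a hypothetical sketch; the path below is an example:

import java.io.IOException;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;

public class JarCheck {

    public static void main(String[] args) {
        String path = "/home/ubuntu/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar"; // hypothetical path
        try (ZipFile zip = new ZipFile(path)) {
            System.out.println("JAR opened fine: " + zip.size() + " entries");
        } catch (ZipException e) {
            System.out.println("Corrupted JAR: " + e.getMessage()); // same root cause as Maven's error
        } catch (IOException e) {
            System.out.println("Could not read file: " + e.getMessage());
        }
    }
}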

2.2. Remove the JAR File

So far, we have figured out how to find the culprit JAR file. We now need to remove it from the local Maven repository. To demonstrate, let's assume we have junit-3.8.1.jar as the corrupted file:

$ rm -rf /home/ubuntu/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar

By using the above command, the junit-3.8.1.jar file is removed from the local Maven repository.

2.3. Rebuild the Project

Now that the corrupted JAR file has been removed from the local Maven repository, let's rebuild the project using the mvn command:

$ mvn clean install

By running the above command, the project will be rebuilt, and Maven will search for the junit-3.8.1.jar dependency in the local repository. Since it won't be able to get it from the local repository, it will download it from the remote repository.

3. Clearing the Local Repository

If multiple JAR files are causing this issue, we need to clean the entire local Maven repository, removing all the JAR files it keeps. By doing this, we ensure that we have the latest versions of the dependencies and that there are no conflicts due to multiple versions of the same JAR file.

3.1. Backup Local Repository

Before removing the existing ~/.m2/repository/ directory, we should first take a backup of it to prevent any data loss. This ensures that we keep a copy of the required dependencies from our local repository.

3.2. Delete the Local Repository

By deleting the local Maven repository, all cached dependencies will be cleared. Since all the dependencies must then be downloaded again, this process can be time-consuming. To demonstrate, let's look at the command to clean the whole repository:

$ rm -rf ~/.m2/repository/

The above command simply removes all the dependencies present in the ~/.m2/repository/ directory.

3.3. Rebuild the Project

As we have already cleaned the entire repository, let's run the command to build the project again:

$ mvn clean install

Using this command, Maven will fetch all the dependencies from the remote repository and add them to the local repository.

4. Conclusion

In this article, we explored different ways to resolve the “Error in Opening Zip File” issue. First, we looked at removing the particular corrupted JAR file. After that, we resolved the issue by completely deleting the entire local Maven repository.

8
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

Digital certificates are important in establishing trusted and secure online communication. We often use them to ensure the data exchanged between the client and the web server remains secure. In this tutorial, we'll explore how to determine in Java whether a given certificate is self-signed or signed by a trusted Certificate Authority (CA). However, due to the diversity of certificates and security concepts, there is no one-size-fits-all solution. We often need to choose the best approach for our specific context and requirements.

2. Self-Signed Vs. CA-Signed Certificate

First, let's examine the differences between self-signed and CA-signed certificates. Simply put, a self-signed certificate is generated and signed by the same entity. Even though it provides encryption, it doesn't verify trust by an independent authority. In other words, it doesn't involve any third-party Certificate Authority (CA). Consequently, when a user's web browser encounters a self-signed certificate, it may issue a security warning since the certificate's authenticity can't be independently verified. We often use them in private networks and for testing purposes.

On the other hand, CA-signed certificates are signed by trusted Certificate Authorities. The majority of web browsers and operating systems recognize and accept these CAs. Additionally, the CA-signed certificate proves the entity holding the certificate is the legitimate owner of the domain, helping users trust they're communicating with the genuine server and not an intermediary. Now, let's see how to check whether we're dealing with a self-signed or CA-signed certificate using Java.

3. Checking if the Certificate Is Self-Signed

Before we start, let's generate the certificate we'll use throughout our examples. The easiest way to generate a self-signed certificate is with the keytool tool:

keytool -genkey -keyalg RSA -alias selfsigned -keystore keystore.jks -validity 365 -keysize 2048

Here, we created a self-signed certificate with the selfsigned alias. Furthermore, we stored it inside the keystore.jks keystore.

3.1. Comparing Issuer and Subject Values

As previously mentioned, the same entity generates and signs a self-signed certificate. The issuer part of the certificate represents the certificate's signer. The self-signed certificate has the same value for the subject (Issued To) and the issuer (Issued By). To put it differently, to determine whether we're dealing with a self-signed certificate, we'll compare its subject and issuer information. Furthermore, the Java API provides the java.security.cert.X509Certificate class for dealing with certificates. With this class, we can interact with X.509 certificates and perform various checks and validations. Let's check whether the subject and issuer match. We can achieve this by extracting the relevant fields from the X509Certificate object and checking if they match:

@Test
void whenCertificateIsSelfSigned_thenSubjectIsEqualToIssuer() throws Exception {
    X509Certificate certificate = (X509Certificate) keyStore.getCertificate("selfsigned");
    assertEquals(certificate.getSubjectDN(), certificate.getIssuerDN());
}

3.2. Verifying the Signature

Another way we can check if we're dealing with a self-signed certificate is to verify it using its own public key. Let's check the self-signed certificate using the verify() method:

@Test
void whenCertificateIsSelfSigned_thenItCanBeVerifiedWithItsOwnPublicKey() throws Exception {
    X509Certificate certificate = (X509Certificate) keyStore.getCertificate("selfsigned");
    assertDoesNotThrow(() -> certificate.verify(certificate.getPublicKey()));
}

However, if we passed a CA-signed certificate, the method would throw an exception.

4. Checking if the Certificate Is CA-Signed

To qualify as CA-signed, a certificate must be part of a chain of trust leading back to a trusted root CA. Simply put, a certificate chain contains a list of certificates starting from the root certificate and ending with the user's certificate. Each certificate in the chain signs the next certificate. When we talk about the chain of trust, there are different certificate types: the Root Certificate, the Intermediate Certificate, and the End Entity Certificate. Furthermore, we use the root and intermediate certificates of the hierarchy to issue and verify end entity certificates. For the purposes of this tutorial, we'll use the certificate obtained from the Baeldung site:

The complexity of checking CA-signed certificates increases if the end entity certificate is part of a certificate chain. In such a scenario, we might need to examine the entire chain to determine if we have a CA-signed certificate. The issuer of the certificate might not be the root CA directly but rather an intermediate CA that signed with the root CA. Now, if we examine the certificate hierarchy, we can see the certificate is part of a certificate chain:

Baltimore CyberTrust Root – Root CA
Cloudflare Inc ECC CA-3 – Intermediate CA
sni.cloudflaressl.com – End Entity (used on the Baeldung site)

4.1. Using Truststore

We can create our own truststore to check if one of the certificates we trust signed the end entity certificate. When setting up a truststore, we typically include the root CA certificates as well as any intermediate CA certificates required to build the chain of trust. This way, our application can effectively validate certificates presented by other parties. An advantage of using a truststore is the ability to decide which CA certificates we trust and which we don't. From our example, the Baltimore CyberTrust Root certificate signed the intermediate Cloudflare certificate, which signed our end entity certificate. Now, let's add both to our truststore:

keytool -importcert -file cloudflare.cer -keystore truststore.jks -alias cloudflare
keytool -importcert -file root.cer -keystore truststore.jks -alias root

Next, to check if we trust the given end entity certificate, we need to find a way to get the root certificate. Let's create the getRootCertificate() method that searches for a root certificate:

X509Certificate getRootCertificate(X509Certificate endEntityCertificate, KeyStore trustStore) throws Exception {
    X509Certificate issuerCertificate = findIssuerCertificate(endEntityCertificate, trustStore);
    if (issuerCertificate != null) {
        if (isRoot(issuerCertificate)) {
            return issuerCertificate;
        } else {
            return getRootCertificate(issuerCertificate, trustStore);
        }
    }
    return null;
}

First, we attempt to locate the issuer of a provided certificate within the trust store:

static X509Certificate findIssuerCertificate(X509Certificate certificate, KeyStore trustStore) throws KeyStoreException {
    Enumeration<String> aliases = trustStore.aliases();
    while (aliases.hasMoreElements()) {
        String alias = aliases.nextElement();
        Certificate cert = trustStore.getCertificate(alias);
        if (cert instanceof X509Certificate) {
            X509Certificate x509Cert = (X509Certificate) cert;
            if (x509Cert.getSubjectX500Principal().equals(certificate.getIssuerX500Principal())) {
                return x509Cert;
            }
        }
    }
    return null;
}

Then, if a match is discovered, we proceed to verify whether the certificate is a self-signed CA certificate. In the event of successful verification, we reach the root certificate. If not, we continue our search. Finally, let's test our method to check whether it works properly:

@Test
void whenCertificateIsCASigned_thenRootCanBeFoundInTruststore() throws Exception {
    X509Certificate endEntityCertificate = (X509Certificate) keyStore.getCertificate("baeldung");
    X509Certificate rootCertificate = getRootCertificate(endEntityCertificate, trustStore);
    assertNotNull(rootCertificate);
}

If we perform the same test using the self-signed certificate, we won't get the root since our trust store doesn't contain it.

5. Checking if the Certificate Is a CA Certificate

In cases where we're dealing only with root or intermediate certificates, we may need to perform additional checks. It's important to note that a root certificate is also a self-signed certificate. However, the difference between the root and the user's self-signed certificate is that the former has the keyCertSign flag enabled (since we can use it to sign other certificates). We can identify root or intermediate certificates by checking the key usage:

@Test
void whenCertificateIsCA_thenItCanBeUsedToSignOtherCertificates() throws Exception {
    X509Certificate certificate = (X509Certificate) keyStore.getCertificate("cloudflare");
    assertTrue(certificate.getKeyUsage()[5]);
}

Moreover, another check we can perform is on the Basic Constraints extension. The Basic Constraints extension is a field in X.509 certificates that provides information about the certificate's intended usage and whether it represents a Certificate Authority (CA) or an end entity. If the Basic Constraints extension doesn't exist, the getBasicConstraints() method returns -1:

@Test
void whenCertificateIsCA_thenBasicConstraintsReturnsZeroOrGreaterThanZero() throws Exception {
    X509Certificate certificate = (X509Certificate) keyStore.getCertificate("cloudflare");
    assertNotEquals(-1, certificate.getBasicConstraints());
}

6. Conclusion

In this article, we learned how to check whether a certificate is self-signed or CA-signed. To sum up, self-signed certificates have the same subject and issuer components, and additionally, they can be verified using their own public key. On the other hand, CA-signed certificates are usually part of the certificate chain. To validate them, we need to create a trust store that contains the trusted root and intermediate certificates and check if the root of the end entity certificate matches one of the trusted certificates. Finally, if we're working with root or intermediate certificates, we can identify them by checking whether they're used for signing other certificates or not. As always, the entire code examples can be found over on GitHub.

9
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Introduction

In this tutorial, we'll focus on the Cartesian product and how to get the Cartesian product of any number of sets in Java. The Cartesian product of multiple sets is a useful concept in Java when you need to generate all possible permutations and combinations of elements from those sets. This operation is commonly used in various scenarios, such as test data generation, database queries, and game development.

2. Cartesian Product

A Cartesian Product is a mathematical operation that combines the elements of multiple sets to create a new set, where each element in the new set is an ordered tuple containing one element from each input set. The size of the Cartesian product is equal to the product of the sizes of the input sets. Let's understand this with the help of an example using three sets: setA, setB, and setC. We'll calculate the Cartesian Product, and the resulting cartesianProduct set will contain all the ordered tuples representing the Cartesian Product of the three input sets:

public void cartesianProduct() {
    Set<Integer> setA = new HashSet<>(Arrays.asList(10, 20));
    Set<String> setB = new HashSet<>(Arrays.asList("John", "Jack"));
    Set<Character> setC = new HashSet<>(Arrays.asList('I', 'J'));

    Set<List<Object>> cartesianProduct = new HashSet<>();
    for (int i : setA) {
        for (String j : setB) {
            for (char k : setC) {
                List<Object> tuple = Arrays.asList(i, j, k);
                cartesianProduct.add(tuple);
            }
        }
    }

    for (List<Object> tuple : cartesianProduct) {
        System.out.println(tuple);
    }
}

Here is the output of the above program:

[10,John,I] [10,John,J] [10,Jack,I] [10,Jack,J] [20,John,I] [20,John,J] [20,Jack,I] [20,Jack,J]

Now, let's look at the various approaches to calculating the Cartesian product of any number of sets.

3. Get Cartesian Product using Plain Java

In the following sections, we'll look at the various ways (iterative, recursive, and using Java 8 streams) to generate the Cartesian Product.

3.1. Recursive Approach

We can use a recursive approach to compute the Cartesian Product of any number of sets in Java. The output is defined as List<List<Object>>, where each inner list can contain a mix of integers, strings, and characters. Here is a sample code to achieve this:

public static void main(String[] args) {
    List<List<Object>> sets = new ArrayList<>();
    sets.add(List.of(10, 20));
    sets.add(List.of("John", "Jack"));
    sets.add(List.of('I', 'J'));
    List<List<Object>> cartesianProduct = getCartesianProduct(sets);
    System.out.println(cartesianProduct);
}

public static List<List<Object>> getCartesianProduct(List<List<Object>> sets) {
    List<List<Object>> result = new ArrayList<>();
    getCartesianProductHelper(sets, 0, new ArrayList<>(), result);
    return result;
}

private static void getCartesianProductHelper(List<List<Object>> sets, int index, List<Object> current, List<List<Object>> result) {
    if (index == sets.size()) {
        result.add(new ArrayList<>(current));
        return;
    }
    List<Object> currentSet = sets.get(index);
    for (Object element : currentSet) {
        current.add(element);
        getCartesianProductHelper(sets, index + 1, current, result);
        current.remove(current.size() - 1);
    }
}

The output contains eight elements in the list:

[[10,John,I] [10,John,J] [10,Jack,I] [10,Jack,J] [20,John,I] [20,John,J] [20,Jack,I] [20,Jack,J]]

3.2. Iterative Approach Using Bit Manipulation

In the following code, we calculate the total number of possible combinations by using bitwise shifting. For each specific combination, we use the binary representation of the combination index. This index allows us to decide which element from each set should be included in the current combination. The formed combinations are then accumulated within the result list and ultimately returned as the Cartesian Product. Note that this particular approach assumes that every input set contains exactly two elements. Let's take a look at another approach, where we try to generate the Cartesian Product using bit manipulation:

public List<List<Object>> getCartesianProductIterative(List<List<Object>> sets) {
    List<List<Object>> result = new ArrayList<>();
    if (sets == null || sets.isEmpty()) {
        return result;
    }
    int totalSets = sets.size();
    // 2^n combinations: this works because each input set here has exactly two elements
    int totalCombinations = 1 << totalSets;
    for (int i = 0; i < totalCombinations; i++) {
        List<Object> combination = new ArrayList<>();
        for (int j = 0; j < totalSets; j++) {
            if (((i >> j) & 1) == 1) {
                combination.add(sets.get(j).get(0));
            } else {
                combination.add(sets.get(j).get(1));
            }
        }
        result.add(combination);
    }
    return result;
}

Here is the output of the above program:

[20, Jack, J] [10, Jack, J] [20, John, J] [10, John, J] [20, Jack, I] [10, Jack, I] [20, John, I] [10, John, I]

3.3. Using Streams

We'll use Java 8 streams and recursive calls to generate the Cartesian Product. The cartesianProduct() method will return a stream of all possible combinations. The base case is when the index reaches the size of the sets, and an empty list is returned to terminate the recursion. Let's use streams to generate the Cartesian Product:

public List<List<Object>> getCartesianProductUsingStreams(List<List<Object>> sets) {
    return cartesianProduct(sets, 0).collect(Collectors.toList());
}

public Stream<List<Object>> cartesianProduct(List<List<Object>> sets, int index) {
    if (index == sets.size()) {
        List<Object> emptyList = new ArrayList<>();
        return Stream.of(emptyList);
    }
    List<Object> currentSet = sets.get(index);
    return currentSet.stream().flatMap(element -> cartesianProduct(sets, index + 1)
      .map(list -> {
          List<Object> newList = new ArrayList<>(list);
          newList.add(0, element);
          return newList;
      }));
}

Here is the output of the above program:

[10,John,I] [10,John,J] [10,Jack,I] [10,Jack,J] [20,John,I] [20,John,J] [20,Jack,I] [20,Jack,J]

4. Get Cartesian Product using Guava

Guava, which is a popular library developed by Google, provides utilities to work with collections, including computing the Cartesian Product of multiple sets. To use Guava for computing the Cartesian Product, let's start by adding Google's Guava library dependency in pom.xml:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>32.1.1-jre</version>
</dependency>

The latest version of the dependency can be checked here. Now, we'll use Guava's Sets.cartesianProduct() method, which takes a list of sets (List<Set<Object>>) as input and returns a set of lists (Set<List<Object>>) containing all the combinations of elements from the input sets. Finally, we transform the set of lists into a list of lists and return the output:

public List<List<Object>> getCartesianProductUsingGuava(List<Set<Object>> sets) {
    Set<List<Object>> cartesianProduct = Sets.cartesianProduct(sets);
    List<List<Object>> cartesianList = new ArrayList<>(cartesianProduct);
    return cartesianList;
}

The output contains eight elements in the list:

[[10,John,I] [10,John,J] [10,Jack,I] [10,Jack,J] [20,John,I] [20,John,J] [20,Jack,I] [20,Jack,J]]

5. Conclusion

In this article, we focused on various ways to calculate the Cartesian Product of any number of sets in Java. While some of them were purely Java, others required additional libraries. Each method has its advantages, and users may prefer them based on the specific use case and performance requirements. The recursive approach is straightforward and easier to understand, while the iterative approach is generally more efficient for larger sets. The complete source code for these examples is available over on GitHub.

10
1
submitted 1 year ago by [email protected] to c/[email protected]

AI is helping us transform Google products and businesses — and it’s how we help Google Cloud customers transform theirs.

An abstract logo showing the word 'Next' in jumbled up letters in blue, red, yellow and green.

11
1
submitted 1 year ago by [email protected] to c/[email protected]

Presented by Tobi Ajila (IBM OpenJ9 Team) during the JVM Language Summit 2023 (Santa Clara CA). Check the JVMLS 2023 playlist for more videos.

12
1
Cloud Next 2023 (blog.google)
submitted 1 year ago by [email protected] to c/[email protected]

At Google Cloud Next 2023, we announced new updates to our products, including what's next in generative AI.

13
1
submitted 1 year ago by [email protected] to c/[email protected]

We’re introducing Google Cloud healthcare customers using Med-PaLM 2 and other generative AI solutions.

Medical themed graphic

14
1
submitted 1 year ago by [email protected] to c/[email protected]
15
1
submitted 1 year ago by [email protected] to c/[email protected]

Developer News

Voting has begun for the Kubernetes Steering election; cast your ballot on the election site, which also tells you if you're eligible or not. If you're not, and should be, then request an exception. With 11 candidates for four seats, this will be a tough one, so give yourself some time.

CVE-2023-3676, CVE-2023-3955, and CVE-2023-3893 were reported for Kubernetes on Windows, and are patched in the current update releases. These are high-risk security issues, and all Windows users should upgrade as soon as possible.

The #kubernetes-contributors Slack channel has been split into #kubernetes-new-contributors and #kubernetes-org-members. The former will be the channel for introductions, getting started, and mentorship requests, whereas org-members will be for established contributor communications.

The SIG-Contribex mailing list will be migrated to a project-controlled Google Group on September 1. This is the first of many mailing list migrations.

Han Kang started a discussion on replumbing Kubernetes for safer upgrades.

Release Schedule

Next Deadline: 1.29 Begins, September 5th

We are in the interval between releases, but if you wanted to be part of the 1.29 release team, there is still time to apply. Patch releases 1.28.1, 1.27.5, 1.26.8, and 1.25.13 came out last week. These include important security patches for Kubernetes on Windows. 1.24 is now EOL, and users of 1.24 need to upgrade or look at their ecosystem support options.

Featured PR

#119592: Add additional utilities to kubectl image

The registry.k8s.io/kubectl container image is one of the release artifacts put out with every version of Kubernetes. Like our other images, it has been built as minimally as possible, using the distroless base image and only containing the kubectl binary and the files required for it to run. While this minimalism makes sense for our daemon images, is the same true for a CLI tool? This PR swaps out the base image for a minimal Debian and installs a suite of basic CLI support tools including bash, sed, awk, grep, diff, and jq. However, concerns have been raised that the improved UX isn't worth the greater risk to users due to those extra tools needing security updates, a task we aren't well set up for. A revert has been proposed pending feedback from the relevant SIGs. If you have thoughts one way or the other about this change, now is the time to make them known!

KEP of the Week

KEP-4006: Transition from SPDY to WebSockets

Currently, the communication involving bi-directional streaming between Kubernetes clients and the API server is done through the SPDY/3.1 protocol. This includes several kubectl commands, like kubectl exec, kubectl cp (built on top of the kubectl exec primitives), kubectl port-forward, and kubectl attach. This KEP transitions the bi-directional communication protocol used from SPDY to WebSockets, since SPDY was deprecated in 2015. WebSockets, on the other hand, is a standardized protocol and provides compatibility with software and programming languages. As of now, the bidirectional streaming is initiated from the Kubernetes clients, proxied by the API server and kubelet, and terminated in the container runtime. This KEP proposes to modify kubectl to request a WebSocket connection, and to modify the API server proxy to translate the kubectl WebSocket data stream into a SPDY upstream connection. This way, everything upstream of the API server need not be changed in the initial implementation. This KEP is in alpha in v1.28.
Other Merges

- onPodConditions is optional in Job FailurePolicy, not required; backported
- CEL replace() estimates the cost of '' as correctly low
- More backfilling of --image-repository in kubeadm commands
- PodSchedulingContext node lists are an atomic list
- Some nice new code docs around how x509 communication works inside Kubernetes
- Node taint manager reports API versions
- Testing updates: PV and PVC endpoints

Promotions

- API List Chunking to GA

Deprecated

- The v1beta3 API of KubeSchedulerConfiguration is deprecated and will be removed in 1.29

Version Updates

- CoreDNS to v1.11.1
- CEL to 1.17.5
- CNI plugins to v1.3.0
- cri-tools to v1.28.0

Subprojects and Dependency Updates

- cloud-provider-azure adds a node non-graceful shutdown feature by adding the node.kubernetes.io/out-of-service taint when nodes are shut down so that the pods can be forcefully deleted
- aws-ebs-csi-driver adds OpenTelemetry tracing of gRPC calls. The feature is currently behind a feature flag

16
1
submitted 1 year ago by [email protected] to c/[email protected]

This week, I will meet our customers and partners at the AWS Summit Mexico. If you are around, please come say hi at the community lounge and at the F1 Game Day where I will spend most of my time. I would love to discuss your developer experience on AWS and listen to your stories about building on AWS.

Last Week's Launches

I am amazed at how quickly service teams are deploying services to the new il-central-1 Region, aka the AWS Israel (Tel-Aviv) Region. I counted no fewer than 25 new service announcements since we opened the Region on August 1, including ten just last week!

In addition to these developments in the new Region, here are some launches that got my attention during the previous week.

AWS Dedicated Local Zones – Just like Local Zones, Dedicated Local Zones are a type of AWS infrastructure that is fully managed by AWS. Unlike Local Zones, they are built for exclusive use by you or your community and placed in a location or data center specified by you to help comply with regulatory requirements. I think about them as a portion of AWS infrastructure dedicated to my exclusive usage.

Enhanced search on AWS re:Post – AWS re:Post is a cloud knowledge service. The enhanced search experience helps you locate answers and discover articles more quickly. Search results now present a consolidated view of all AWS knowledge on re:Post, showing AWS Knowledge Center articles, questions and answers, and community articles that are relevant to the user's search query.

Amazon QuickSight supports scheduled programmatic export to Microsoft Excel – Amazon QuickSight now supports scheduled generation of Excel workbooks by selecting multiple tables and pivot table visuals from any sheet of a dashboard. Snapshot Export APIs will now also support programmatic export to Excel format, in addition to Paginated PDF and CSV.

Amazon WorkSpaces announced a new client to support Ubuntu 20.04 and 22.04 – The new client, powered by WorkSpaces Streaming Protocol (WSP), improves the remote desktop experience by offering enhanced web conferencing functionality, better multi-monitor support, and a more user-friendly interface. To get started, simply download the new Linux client versions from the Amazon WorkSpaces client download website.

Amazon SageMaker CPU/GPU profiler – We launched the preview of Amazon SageMaker Profiler, an advanced observability tool for large deep learning workloads. With this new capability, you can access granular, compute-hardware-related profiling insights to optimize model training performance.

Amazon SageMaker rolling deployment strategy – You can now update your Amazon SageMaker endpoints using a rolling deployment strategy. Rolling deployments make it easier to update fully scaled endpoints that are deployed on hundreds of popular accelerated compute instances.

For a full list of AWS announcements, be sure to keep an eye on the What's New at AWS page.

Other AWS News

Some other updates and news that you might have missed:

On-demand Container Loading in AWS Lambda – This one is not new this week, but I spotted it while taking a few days of holiday. Marc Brooker and team were awarded Best Paper by the USENIX Association for On-demand Container Loading in AWS Lambda (pdf). They explain in detail the challenges of loading (huge) container images in AWS Lambda. A must-read if you're curious how Lambda functions work behind the scenes.

The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the ones in French, German, Italian, and Spanish.

AWS Open Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS Events

Check your calendars and sign up for these AWS events:

AWS Hybrid Cloud & Edge Day (August 30) – Join a free-to-attend one-day virtual event to hear about the latest hybrid cloud and edge computing trends and emerging technologies, and to learn best practices from AWS leaders, customers, and industry analysts. To learn more, see the detailed agenda and register now.

AWS Global Summits – The 2023 AWS Summits season is almost over, with the last two in-person events in Mexico City (August 30) and Johannesburg (September 26).

AWS re:Invent – But don't worry, because re:Invent season (November 27–December 1) is getting closer. Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Aotearoa (September 6), Lebanon (September 9), Munich (September 14), Argentina (September 16), Spain (September 23), and Chile (September 30). Visit the landing page to check out all the upcoming AWS Community Days.

CDK Day (September 29) – A community-led fully virtual event with tracks in English and Spanish about CDK and related projects. Learn more at the website.

That’s all for this week. Check back next Monday for another Week in Review!

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

-- seb

17
1
submitted 1 year ago by [email protected] to c/[email protected]

Code reflection is a proposed enhancement to reflective programming in Java that enables standard access, analysis, and transformation of Java code. Code reflection is designed to address limitations in today's Java platform when it is used to support specific programming domains where symbolic access to Java programs is required, e.g., Java programs representing SQL statements, differentiable programs, machine learning models, or GPU kernels. Check the JVMLS 2023 playlist for more videos.

18
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

Random selection of elements from a Set is a common requirement in various Java applications, especially in games and data processing tasks. In this article, we'll explore different methods to pick a random element from a Java Set.

2. Using the java.util.Random Class

The java.util.Random class is a handy tool for generating random numbers. To pick a random element from a Set, we can generate a random index and use it to access the element:

public static <T> T getByRandomClass(Set<T> set) {
    if (set == null || set.isEmpty()) {
        throw new IllegalArgumentException("The Set cannot be empty.");
    }
    int randomIndex = new Random().nextInt(set.size());
    int i = 0;
    for (T element : set) {
        if (i == randomIndex) {
            return element;
        }
        i++;
    }
    throw new IllegalStateException("Something went wrong while picking a random element.");
}

Let's test our method:

Set<String> animals = new HashSet<>();
animals.add("Lion");
animals.add("Elephant");
animals.add("Giraffe");
String randomAnimal = getByRandomClass(animals);
System.out.println("Randomly picked animal: " + randomAnimal);

The result should be random:

Randomly picked animal: Giraffe

3. Using the ThreadLocalRandom Class

Starting from Java 7, the ThreadLocalRandom class provides a more efficient and thread-safe alternative for generating random numbers. Here's how we can use it to pick a random index from a Set:

int randomIndex = ThreadLocalRandom.current().nextInt(set.size());

The solution is the same as above except for how the random number is selected. Using ThreadLocalRandom is preferable to java.util.Random because it reduces contention in multi-threaded scenarios and generally offers better performance.
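For completeness, here is what the full helper could look like with ThreadLocalRandom swapped in; this variant isn't shown in the original article, but it follows the same iteration logic as the java.util.Random example above (imports: java.util.Set and java.util.concurrent.ThreadLocalRandom):

public static <T> T getByThreadLocalRandom(Set<T> set) {
    if (set == null || set.isEmpty()) {
        throw new IllegalArgumentException("The Set cannot be empty.");
    }
    // ThreadLocalRandom avoids creating a new Random instance and reduces contention
    int randomIndex = ThreadLocalRandom.current().nextInt(set.size());
    int i = 0;
    for (T element : set) {
        if (i == randomIndex) {
            return element;
        }
        i++;
    }
    throw new IllegalStateException("Something went wrong while picking a random element.");
}

Either variant works; ThreadLocalRandom simply avoids allocating a new Random object on every call.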

4. Conclusion

In summary, we've learned two ways to pick a random element from a Java Set. The example code from this article can be found over on GitHub.

19
1
submitted 1 year ago by [email protected] to c/[email protected]

Java 21 is chock-full of great features, and if you're coming all the way from 17, there's a plethora of additions to use and get used to, but it's all for naught if you can't actually update. In this #RoadTo21 episode, we discuss all you need to know to update from Java 17 to 21: API changes that may require you to update your code (like the introduction of sequenced collections or bug fixes in Double/Float::toString and IdentityHashMap), ongoing deprecations (threading, security manager, finalization, and more), and changes in networking (like earlier URL validation and HTTP timeouts), encoding (UTF-8 by default and changes in date/time/unit formatting), the runtime (like removed options and class loading), and tooling (like new warnings). We'll also go beyond the nitty-gritty details and see the bigger picture of how to best prepare and execute your Java and third-party updates by talking about inside.java, release notes, Quality Outreach, and much more. Make sure to check the show notes.
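The sequenced-collections change called out above is easy to demo; here is a tiny, self-contained sketch (my own illustration, not from the episode) of the SequencedCollection API that List implements in Java 21:

import java.util.ArrayList;
import java.util.List;

public class SequencedCollectionsDemo {
    public static void main(String[] args) {
        // In Java 21, List extends SequencedCollection, which defines a uniform
        // API for collections with a well-defined encounter order.
        List<String> releases = new ArrayList<>(List.of("17", "21"));

        releases.addFirst("11");                 // prepend
        System.out.println(releases.getFirst()); // "11"
        System.out.println(releases.getLast());  // "21"
        System.out.println(releases.reversed()); // [21, 17, 11]
    }
}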

20
1
submitted 1 year ago by [email protected] to c/[email protected]

This post was co-authored by Dan Russ, Associate Director, and Sacha Abinader, Managing Director from Accenture.

The year 2022 was a notable one in the history of our climate—it stood as the fifth warmest year ever recorded1. An increase in extreme weather conditions, from devastating droughts and wildfires to relentless floods and heat waves, made their presence felt more than ever before—and 2023 seems poised to shatter still more records. These unnerving circumstances demonstrate the ever-growing impact of climate change that we've come to experience as the planet continues to warm.

Microsoft's sustainability journey

At Microsoft, our approach to mitigating the climate crisis is rooted in both addressing the sustainability of our own operations and in empowering our customers and partners in their journey to net-zero emissions. In 2020, Microsoft set out with a robust commitment: to be a carbon-negative, water-positive, and zero-waste company, while protecting ecosystems, all by the year 2030. Three years later, Microsoft remains steadfast in its resolve.

As part of these efforts, Microsoft has launched Microsoft Cloud for Sustainability, a comprehensive suite of enterprise-grade sustainability management tools aimed at supporting businesses in their transition to net-zero. Moreover, our contribution to several global sustainability initiatives has the goal of benefiting every individual and organization on this planet. Microsoft has accelerated the availability of innovative climate technologies through our Climate Innovation Fund and is working hard to strengthen our climate policy agenda.

Microsoft's focus on sustainability-related efforts forms the backdrop for the topic tackled in this blog post: our partnership with Accenture on the application of AI technologies toward solving the challenging problem of methane emissions detection, quantification, and remediation in the energy industry.

"We are excited to partner with Accenture to deliver methane emissions management capabilities. This combines Accenture's deep domain knowledge together with Microsoft's cloud platform and expertise in building AI solutions for industry problems. The result is a solution that solves real business problems and that also makes a positive climate impact."—Matt Kerner, CVP Microsoft Cloud for Industry, Microsoft.

Why is methane important?

Methane is approximately 85 times more potent than carbon dioxide (CO2) at trapping heat in the atmosphere over a 20-year period. It is the second most abundant anthropogenic greenhouse gas after CO2, accounting for about 20 percent of global emissions. The global oil and gas industry is one of the primary sources of methane emissions. These emissions occur across the entire oil and gas value chain, from production and processing to transmission, storage, and distribution. The International Energy Agency (IEA) estimates that it is technically possible to avoid around 75 percent of today's methane emissions from global oil and gas operations. These statistics drive home the importance of addressing this critical issue.

Microsoft's investment in Project Astra

Microsoft has signed on to the Project Astra initiative—together with leading energy companies, public sector organizations, and academic institutions—in a coordinated effort to demonstrate a novel approach to detecting and measuring methane emissions from oil and gas production sites.
Project Astra entails an innovative sensor network that harnesses advances in methane-sensing technologies, data sharing, and data analytics to provide near-continuous monitoring of methane emissions across oil and gas facilities. Once operational, this kind of smart digital network would allow producers and regulators to pinpoint methane releases for timely remediation.

Accenture and Microsoft—The future of methane management

Attaining the goal of net-zero methane emissions is becoming increasingly possible. The technologies needed to mitigate emissions are maturing rapidly, and digital platforms are being developed to integrate complex components. As referenced in Accenture's recent methane thought leadership piece, "More than hot air with methane emissions", what is needed now is a shift—from a reactive paradigm to a preventative one—where the critical issue of leak detection and remediation is transformed into leak prevention by leveraging advanced technologies.

Accenture's specific capabilities and toolkit

To date, the energy industry's approach to methane management has been fragmented, made up of a host of costly monitoring tools and equipment siloed across various operational entities. These siloed solutions have made it difficult for energy companies to accurately analyze emissions data at scale and remediate problems quickly. What has been lacking is a single, affordable platform that can integrate these components into an effective methane emissions mitigation tool. These components include enhanced detection and measurement capabilities, machine learning for better decision-making, and modified operating procedures and equipment that make "net-zero methane" happen faster. These platforms are being developed now and can accommodate a wide variety of technology solutions that will form the digital core necessary to achieve a competitive advantage. Accenture has created a Methane Emissions Monitoring Platform (MEMP) that facilitates the integration of multiple data streams and embeds key methane insights into business operations to drive action (see Figure 1 below).

Figure 1: Accenture's Methane Emissions Monitoring Platform (MEMP).

The cloud-based platform, which runs on Microsoft Azure, enables energy companies to both measure baseline methane emissions in near real-time and detect leaks using satellites, fixed wing aircraft, and ground level sensing technologies. It is designed to integrate multiple data sources to optimize venting, flaring, and fugitive emissions. Figure 2 below illustrates the aspirational end-to-end process incorporating Microsoft technologies. MEMP also facilitates connectivity with back-end systems responsible for work order creation and management, including the scheduling and dispatching of field crews to remediate specific emission events.

Figure 2: The Methane Emissions Monitoring Platform Workflow (aspirational).

Microsoft's AI tools powering Accenture's Methane Emissions Monitoring Platform

Microsoft has provided a number of Azure-based AI tools for tackling methane emissions, including tools that support sensor placement optimization, digital twin for methane Internet of Things (IoT) sensors, anomaly (leak) detection, and emission source attribution and quantification. These tools, when integrated with Accenture's MEMP, allow users to monitor alerts in near real-time through a user-friendly interface, as shown in Figure 3.

Figure 3: MEMP Landing Page visualizing wells, IoT sensors, and Work Orders.

"Microsoft has developed differentiated AI capabilities for methane leak detection and remediation, and is excited to partner with Accenture in integrating these features onto their Methane Emissions Monitoring Platform, to deliver value to energy companies by empowering them in their path to net-zero emissions."—Merav Davidson, VP, Industry AI, Microsoft.

Methane IoT sensor placement optimization

Placing sensors in strategic locations to ensure maximum potential coverage of the field and timely detection of methane leaks is the first step towards building a reliable end-to-end IoT-based detection and quantification solution. Microsoft's solution for sensor placement utilizes geospatial, meteorological, and historical leak rate data and an atmospheric dispersion model to model methane plumes from sources within the area of interest and obtain a consolidated view of emissions. It then selects the best locations for sensors using either a mathematical programming optimization method, a greedy approximation method, or an empirical downwind method that considers the dominant wind direction, subject to cost constraints. In addition, Microsoft provides a validation module to evaluate the performance of any candidate sensor placement strategy. Operators can evaluate the marginal gains offered by utilizing additional sensors in the network, through sensitivity analysis as shown in Figure 4 below.

Figure 4: Left: Increase in leak coverage with the number of sensors. By increasing the number of sensors available for deployment, the leak detection ratio (i.e., the fraction of leaks detected by deployed sensors) increases. Right: Source coverage for 15 sensors. The arrows map each sensor (red circles) to the sources (black triangles) that it detects.

End-to-end data pipeline for methane IoT sensors

To achieve continuous monitoring of methane emissions from oil and gas assets, Microsoft has implemented an end-to-end solution pipeline where streaming data from IoT Hub is ingested into a Bronze Delta Lake table leveraging Structured Streaming on Spark. Sensor data cleaning, aggregation, and transformation to the algorithm data model are performed, and the resultant data is stored in a Silver Delta Lake table in a format that is optimized for downstream AI tasks. Methane leak detection is performed using uni- and multi-variate anomaly detection models for improved reliability. Once a leak has been detected, its severity is also computed, and the emission source attribution and quantification algorithm then identifies the likely source of the leak and quantifies the leak rate. This event information is sent to the Accenture Work Order Prioritization module to trigger appropriate alerts based on the severity of the leak to enable timely remediation of fugitive or venting emissions. The quantified leaks can also be recorded and reported using tools such as the Microsoft Sustainability Manager app. The individual components of this end-to-end pipeline are described in the sections below and illustrated in Figure 5.

Figure 5: End-to-end IoT data pipeline that runs on Microsoft Azure demonstrating methane leak detection, quantification, and remediation capabilities.

Digital twin for methane IoT sensors

Data streaming from IoT sensors deployed in the field needs to be orchestrated and reliably passed to the processing and AI execution pipeline. Microsoft's solution creates a digital twin for every sensor. The digital twin comprises a sensor simulation module that is leveraged in different stages of the methane solution pipeline. The simulator is used to test the end-to-end pipeline before field deployment, reconstruct and analyze anomalous events through what-if scenarios, and enable the source attribution and leak quantification module through a simulation-based, inverse modeling approach.

Anomaly (leak) detection

A methane leak at a source could manifest as an unusual rise in the methane concentration detected at nearby sensor locations, requiring timely mitigation. The first step towards identifying such an event is to trigger an alert through the anomaly detection system. A severity score is computed for each anomaly to help prioritize alerts. Microsoft provides the following two methods for time series anomaly detection, leveraging Microsoft's open-source SynapseML library, which is built on the Apache Spark distributed computing framework and simplifies the creation of massively scalable machine learning pipelines:

Univariate anomaly detection: Based on a single variable, for example, methane concentration.

Multivariate anomaly detection: Used in scenarios where multiple variables, including methane concentration, wind speed, wind direction, temperature, relative humidity, and atmospheric pressure, are used to detect an anomaly.

Post-processing steps are implemented to reliably flag true anomalous events so that remedial actions can be taken in a timely manner while reducing false positives to avoid unnecessary and expensive field trips for personnel. Figure 6 below illustrates this feature in Accenture's MEMP: the "hover box" over Sensor 6 documents a total of seven alerts resulting in just two work orders being created.

Figure 6: MEMP dashboard visualizing alerts and resulting work orders for Sensor 6.

Emission source attribution and quantification

Once deployed in the field, methane IoT sensors can only measure compound signals in the proximity of their location. For an area of interest that is densely populated with potential emission sources, the challenge is to identify the source(s) of the emission event. Microsoft provides two approaches for identifying the source of a leak:

Area of influence attribution model: Given the sensor measurements and location, an "area of influence" is computed for a sensor location at which a leak is detected, based on the real-time wind direction and asset geo-location. Then, the asset(s) that lie within the computed "area of influence" are identified as potential emission sources for that flagged leak.

Bayesian attribution model: With this approach, source attribution is achieved through inversion of the methane dispersion model. The Bayesian approach comprises two main components—a source leak quantification model and a probabilistic ranking model. It can account for uncertainties in the data stemming from measurement noise and statistical and systematic errors, and it provides the most likely sources for a detected leak, the associated confidence level, and the leak rate magnitude. Considering the high number of sources, the low number of sensors, and the variability of the weather, this poses a complex but highly valuable inverse modeling problem to solve.

Figure 7 provides insight regarding leaks and work orders for a particular well (Well 24). Specifically, the diagrams provide well-centric and sensor-centric assessments that attribute a leak to this well.

Figure 7: Leak Source Attribution for Well 24.

Further, Accenture's Work Order Prioritization module, using the Microsoft Dynamics 365 Field Service application (Figure 8), enables energy operators to initiate remediation measures under the Leak Detection and Remediation (LDAR) paradigm.

Figure 8: Dynamics 365 Work Order with emission source attribution and CH4 concentration trend data embedded.

Looking ahead

In partnership with Microsoft, Accenture is looking to continue refining MEMP, which is built on the advanced AI and statistical models presented in this blog. Future capabilities of MEMP look to move from "detection and remediation" to "prediction and prevention" of emission events, including enhanced event quantification and source attribution. Microsoft and Accenture will continue to invest in advanced capabilities with an eye toward both:

Integrating industry standards platforms such as Azure Data Manager for Energy (ADME) and Open Footprint Forum to enable both publishing and consumption of emissions data.

Leveraging Generative AI to simplify the user experience.

Learn more

Case study: Duke Energy is working with Accenture and Microsoft on the development of a new technology platform designed to measure actual baseline methane emissions from natural gas distribution systems.

Accenture Methane Emissions Monitoring Platform: More information regarding Accenture's MEMP can be found in "More than hot air with methane emissions". Additional information regarding Accenture can be found on the Accenture homepage and on their energy page.

Microsoft Azure Data Manager for Energy: Azure Data Manager for Energy is an enterprise-grade, fully managed OSDU Data Platform for the energy industry that is efficient, standardized, easy to deploy, and scalable for data management—ingesting, aggregating, storing, searching, and retrieving data. The platform will provide the scale, security, privacy, and compliance expected by our enterprise customers. The platform offers out-of-the-box compatibility with major service company applications, which allows geoscientists to use domain-specific applications on data contained in Azure Data Manager for Energy with ease.

Related publications and conference presentations

Source Attribution and Emissions Quantification for Methane Leak Detection: A Non-Linear Bayesian Regression Approach. Mirco Milletari, Sara Malvar, Yagna Oruganti, Leonardo Nunes, Yazeed Alaudah, Anirudh Badam. The 8th International Online & Onsite Conference on Machine Learning, Optimization, and Data Science.

Surrogate Modeling for Methane Dispersion Simulations Using Fourier Neural Operator. Qie Zhang, Mirco Milletari, Yagna Oruganti, Philipp Witte. Presented at the NeurIPS 2022 Workshop on Tackling Climate Change with Machine Learning.

1 https://climate.nasa.gov/news/3246/nasa-says-2022-fifth-warmest-year-on-record-warming-trend-continues/

The post Microsoft and Accenture partner to tackle methane emissions with AI technology appeared first on Azure Blog.

21
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

In this tutorial, we'll discuss the enhanced Testcontainers support introduced in Spring Boot 3.1. This update provides a more streamlined approach to configuring the containers, and it allows us to start them for local development purposes. As a result, developing and running tests using Testcontainers becomes a seamless and efficient process.

2. Testcontainers Prior to Spring Boot 3.1

We can use Testcontainers to create a production-like environment during the testing phase. By doing so, we'll eliminate the need for mocks and write high-quality automated tests that aren't coupled to the implementation details. For the code examples in this article, we'll use a simple web application with a MongoDB database as a persistence layer and a small REST interface:

@RestController
@RequestMapping("characters")
public class MiddleEarthCharactersController {
    private final MiddleEarthCharactersRepository repository;
    // constructor not shown

    @GetMapping
    public List<MiddleEarthCharacter> findByRace(@RequestParam String race) {
        return repository.findAllByRace(race);
    }

    @PostMapping
    public MiddleEarthCharacter save(@RequestBody MiddleEarthCharacter character) {
        return repository.save(character);
    }
}

During the integration tests, we'll spin up a Docker container containing the database server. Since the database port exposed by the container will be dynamically allocated, we cannot define the database URL in the properties file. As a result, for a Spring Boot application with a version prior to 3.1, we'd need to use the @DynamicPropertySource annotation in order to add these properties to a DynamicPropertyRegistry:

@Testcontainers
@SpringBootTest(webEnvironment = DEFINED_PORT)
class DynamicPropertiesIntegrationTest {

@Container
static MongoDBContainer mongoDBContainer = new MongoDBContainer(DockerImageName.parse("mongo:4.0.10"));
@DynamicPropertySource 
static void setProperties(DynamicPropertyRegistry registry) {
    registry.add("spring.data.mongodb.uri", mongoDBContainer::getReplicaSetUrl);
}
// ...

}

For the integration test, we'll use the @SpringBootTest annotation to start the application on the port defined in the configuration files. Additionally, we'll use Testcontainers for setting up the environment. Finally, let's use REST-assured for executing the HTTP requests and asserting the validity of the responses:

@Test
void whenRequestingHobbits_thenReturnFrodoAndSam() {
    repository.saveAll(List.of(
        new MiddleEarthCharacter("Frodo", "hobbit"),
        new MiddleEarthCharacter("Samwise", "hobbit"),
        new MiddleEarthCharacter("Aragon", "human"),
        new MiddleEarthCharacter("Gandalf", "wizzard")
    ));

    when().get("/characters?race=hobbit")
      .then().statusCode(200)
      .and().body("name", hasItems("Frodo", "Samwise"));
}
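The controller and test above reference a MiddleEarthCharacter document and a MiddleEarthCharactersRepository that the post doesn't show. A minimal sketch of what they might look like (the field names are inferred from the test data; the Spring Data MongoDB annotations are my assumption, and imports are omitted as in the other snippets) could be:

@Document(collection = "characters")
public class MiddleEarthCharacter {

    @Id
    private String id;
    private String name;
    private String race;

    public MiddleEarthCharacter() {
    }

    public MiddleEarthCharacter(String name, String race) {
        this.name = name;
        this.race = race;
    }

    // getters and setters omitted for brevity
}

public interface MiddleEarthCharactersRepository extends MongoRepository<MiddleEarthCharacter, String> {
    List<MiddleEarthCharacter> findAllByRace(String race);
}

Any equivalent mapping would work, as long as findAllByRace matches the derived query method used by the controller.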

3. Using @ServiceConnection for Dynamic Properties

Starting with Spring Boot 3.1, we can utilize the @ServiceConnection annotation to eliminate the boilerplate code of defining the dynamic properties. Firstly, we'll need to include the spring-boot-testcontainers dependency in our pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-testcontainers</artifactId>
    <scope>test</scope>
</dependency>

After that, we can remove the static method that registers all the dynamic properties. Instead, we'll simply annotate the container with @ServiceConnection:

@Testcontainers
@SpringBootTest(webEnvironment = DEFINED_PORT)
class ServiceConnectionIntegrationTest {

    @Container
    @ServiceConnection
    static MongoDBContainer mongoDBContainer = new MongoDBContainer(DockerImageName.parse("mongo:4.0.10"));

    // ...
}

The @ServiceConnection annotation allows Spring Boot's autoconfiguration to dynamically register all the needed properties. Behind the scenes, @ServiceConnection determines which properties are needed based on the container class or on the Docker image name. A list of all the containers and images that support this annotation can be found in Spring Boot's official documentation.

4. Testcontainers Support for Local Development

Another exciting feature is the seamless integration of Testcontainers into local development with minimal configuration. This functionality enables us to replicate the production environment not only during testing but also for local development. In order to enable it, we first need to create a @TestConfiguration and declare all the Testcontainers as Spring beans. Let's also add the @ServiceConnection annotation that will seamlessly bind the application to the database:

@TestConfiguration(proxyBeanMethods = false)
class LocalDevTestcontainersConfig {

    @Bean
    @ServiceConnection
    public MongoDBContainer mongoDBContainer() {
        return new MongoDBContainer(DockerImageName.parse("mongo:4.0.10"));
    }
}

Because all the Testcontainers dependencies are imported with a test scope, we'll need to start the application from the test package. Consequently, let's create in this package a main() method that calls the actual main() method from the java package:

public class LocalDevApplication {

    public static void main(String[] args) {
        SpringApplication.from(Application::main)
          .with(LocalDevTestcontainersConfig.class)
          .run(args);
    }
}

This is it. Now we can start the application locally from this main() method, and it will use the MongoDB database. Let's send a POST request from Postman and then connect directly to the database to check if the data was correctly persisted.

In order to connect to the database, we'll need to find the port exposed by the container. We can fetch it from the application logs or simply by running the docker ps command.

Finally, we can use a MongoDB client to connect to the database using the URL mongodb://localhost:63437/test and query the characters collection. That's it: we're able to connect to and query the database started by the Testcontainer for local development.

5. Integration With DevTools and @RestartScope

If we restart the application often during local development, a potential downside is that all the containers will be restarted each time. As a result, the start-up will potentially be slower and the test data will be lost. However, we can keep containers alive when the application is shut down by leveraging the Testcontainers integration with spring-boot-devtools. This is an experimental Testcontainers feature that enables a smoother and more efficient development experience, as it saves valuable time and test data. Let's start by adding the spring-boot-devtools dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-devtools</artifactId>
    <scope>runtime</scope>
    <optional>true</optional>
</dependency>

Now, we can go back to the test configuration for local development and annotate the Testcontainers beans with the @RestartScope annotation:

@Bean
@RestartScope
@ServiceConnection
public MongoDBContainer mongoDBContainer() {
    return new MongoDBContainer(DockerImageName.parse("mongo:4.0.10"));
}

Alternatively, we can use the withReuse(true) method of the Testcontainers API:

@Bean
@ServiceConnection
public MongoDBContainer mongoDBContainer() {
    return new MongoDBContainer(DockerImageName.parse("mongo:4.0.10"))
      .withReuse(true);
}

As a result, we can now start the application from the main() method in the test package and take advantage of the spring-boot-devtools live-reload functionality. For instance, we can save an entry from Postman, then recompile and reload the application. Let's introduce a minor change, like switching the request mapping from "characters" to "api/characters", and re-compile.

We can already see from the application logs or from Docker itself that the database container wasn't restarted. Nevertheless, let's go one step further and check that the application reconnected to the same database after the restart. For example, we can do this by sending a GET request to the new path and expecting the previously inserted data to be there.

  6. Conclusion

In this article, we've discussed Spring Boot 3.1's new Testcontainers features. We learned how to use the new @ServiceConnection annotation, which provides a streamlined alternative to @DynamicPropertySource and its boilerplate configuration. Following that, we delved into utilizing Testcontainers for local development by creating an additional main() method in the test package and declaring the containers as Spring beans. In addition to this, the integration with spring-boot-devtools and @RestartScope enabled us to create a fast, consistent, and reliable environment for local development. As always, the complete code used in this article is available over on GitHub.

22
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

In Java programming, dealing with strings and patterns is essential to many applications. Regular expressions, commonly known as regex, provide a powerful tool for pattern matching and manipulation. Sometimes, we not only need to identify matches within a string, but also locate exactly where these matches occur. In this tutorial, we'll explore getting the indexes of regex pattern matches in Java.

2. Introduction to the Problem

Let's start with a String example:

String INPUT = "This line contains <the first value>, <the second value>, and <the third value>.";

Let's say we want to extract all "<…>" segments from the string above, such as "<the first value>" and "<the second value>". To match these segments, we can use regex's NOR character classes: "<[^>]*>". In Java, the Pattern and Matcher classes from the Regex API are important tools for working with pattern matching. These classes provide methods to compile regex patterns and apply them to strings for various operations. So next, let's use Pattern and Matcher to extract the desired text. For simplicity, we'll use AssertJ assertions to verify whether we obtained the expected result:

Pattern pattern = Pattern.compile("<[^>]*>");
Matcher matcher = pattern.matcher(INPUT);
List<String> result = new ArrayList<>();
while (matcher.find()) {
    result.add(matcher.group());
}
assertThat(result).containsExactly("<the first value>", "<the second value>", "<the third value>");

As the code above shows, we extracted all "<…>" parts from the input String. However, sometimes we want to know exactly where the matches are located in the input. In other words, we want to obtain the matches and their indexes in the input string. Next, let's extend this code to achieve our goals.

3. Obtaining Indexes of Matches

We've used the Matcher class to extract the matches. The Matcher class offers two methods, start() and end(), which allow us to obtain each match's start and end indexes. It's worth noting that the Matcher.end() method returns the index after the last character of the matched subsequence. An example can show this clearly:

Pattern pattern = Pattern.compile("456");
Matcher matcher = pattern.matcher("0123456789");
String result = null;
int startIdx = -1;
int endIdx = -1;
if (matcher.find()) {
    result = matcher.group();
    startIdx = matcher.start();
    endIdx = matcher.end();
}
assertThat(result).isEqualTo("456");
assertThat(startIdx).isEqualTo(4);
assertThat(endIdx).isEqualTo(7); // matcher.end() returns 7 instead of 6

Now that we understand what start() and end() return, let's see if we can obtain the indexes of each matched "<…>" subsequence in our INPUT:

Pattern pattern = Pattern.compile("<[^>]*>");
Matcher matcher = pattern.matcher(INPUT);
List<String> result = new ArrayList<>();
Map<Integer, Integer> indexesOfMatches = new LinkedHashMap<>();
while (matcher.find()) {
    result.add(matcher.group());
    indexesOfMatches.put(matcher.start(), matcher.end());
}
assertThat(result).containsExactly("<the first value>", "<the second value>", "<the third value>");
assertThat(indexesOfMatches.entrySet()).map(entry -> INPUT.substring(entry.getKey(), entry.getValue()))
  .containsExactly("<the first value>", "<the second value>", "<the third value>");

As the test above shows, we stored each match's start() and end() results in a LinkedHashMap to preserve the insertion order. Then, we extracted substrings from the original input by these index pairs. If we obtained the correct indexes, these substrings must equal the matches. If we give this test a run, it passes.
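As a side note that goes beyond the original article: since Java 9, Matcher.results() exposes every match as a MatchResult, so the same index bookkeeping can be written as a stream. A minimal sketch (Stream.toList() needs Java 16+; use collect(Collectors.toList()) on older versions):

import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

public class MatchIndexesDemo {
    public static void main(String[] args) {
        String input = "This line contains <the first value>, <the second value>, and <the third value>.";

        // Each MatchResult carries the matched text plus its start and end indexes
        List<MatchResult> matches = Pattern.compile("<[^>]*>")
          .matcher(input)
          .results()
          .toList();

        for (MatchResult match : matches) {
            System.out.printf("%s -> [%d, %d)%n", match.group(), match.start(), match.end());
        }
    }
}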

4. Obtaining Indexes of Matches With Capturing Groups

In regex, capturing groups play a crucial role by allowing us to reference them later or conveniently extract sub-patterns. To illustrate, suppose we aim to extract the content enclosed between '<' and '>'. In such cases, we can create a pattern that incorporates a capturing group: "<([^>]*)>". As a result, when utilizing Matcher.group(1), we obtain the text "the first value", "the second value", and so on. When no explicit capturing group is defined, the entire regex pattern assumes the default group with the index 0. Therefore, invoking Matcher.group() is synonymous with calling Matcher.group(0). Much like the behavior of the Matcher.group() function, the Matcher.start() and Matcher.end() methods support specifying a group index as an argument. Consequently, these methods provide the starting and ending indexes corresponding to the matched content within that group:

Pattern pattern = Pattern.compile("<([^>]*)>");
Matcher matcher = pattern.matcher(INPUT);
List<String> result = new ArrayList<>();
Map<Integer, Integer> indexesOfMatches = new LinkedHashMap<>();
while (matcher.find()) {
    result.add(matcher.group(1));
    indexesOfMatches.put(matcher.start(1), matcher.end(1));
}
assertThat(result).containsExactly("the first value", "the second value", "the third value");
assertThat(indexesOfMatches.entrySet()).map(entry -> INPUT.substring(entry.getKey(), entry.getValue()))
  .containsExactly("the first value", "the second value", "the third value");

5. Conclusion

In this article, we explored obtaining the indexes of pattern matches within the original input when dealing with regex. We discussed scenarios involving patterns with and without explicitly defined capturing groups. As always, the complete source code for the examples is available over on GitHub.

23
1
submitted 1 year ago by [email protected] to c/[email protected]

Check the JVMLS 2023 playlist for more videos.

24
0
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

Extracting specific content from within patterns is common when we work with text processing. Sometimes, when dealing with data that uses square brackets to encapsulate meaningful information, extracting text enclosed within square brackets might be a challenge for us. In this tutorial, we'll explore the techniques and methods to extract content between square brackets.

2. Introduction to the Problem

First of all, for simplicity, let's set two prerequisites for the problem:

No nested square bracket pairs – For example, patterns like "..[value1 [value2]].." won't come as our input.
Square brackets are always well-paired – For instance, ".. [value1 …" is an invalid input.

When discussing input data enclosed within square brackets, we encounter two possible scenarios:

Input with a single pair of square brackets, as seen in "..[value].."
Input with multiple pairs of square brackets, illustrated by "..[value1]..[value2]..[value3]…"

Moving forward, our focus will be on addressing the single-pair scenario first, and then we'll adapt the solutions for cases involving multiple pairs. Throughout this tutorial, the primary technique we'll use to solve these challenges will be Java regular expressions (regex).

3. Input With a Single Pair of Square Brackets

Let's say we're given a text input:

String INPUT1 = "some text [THE IMPORTANT MESSAGE] something else";

As we can see, the input contains only one square bracket pair, and we aim to get the text in between:

String EXPECTED1 = "THE IMPORTANT MESSAGE";

So next, let's see how to achieve that.

3.1. The [.*] Idea

A direct approach to this problem involves extracting content between the '[' and ']' characters. So, we may come up with the regex pattern "[.*]". However, we cannot use this pattern directly in our code, as regex uses '[' and ']' for character class definitions. For example, the "[0-9]" class matches any digit character. We must escape them to match a literal '[' or ']'. Furthermore, our task is extracting instead of matching. Therefore, we can put our target match in a capturing group so that it's easier to reference and extract later:

String result = null;
String rePattern = "\\[(.*)]";
Pattern p = Pattern.compile(rePattern);
Matcher m = p.matcher(INPUT1);
if (m.find()) {
    result = m.group(1);
}
assertThat(result).isEqualTo(EXPECTED1);

Sharp eyes may notice that we only escaped the opening '[' in the above code. This is because, for brackets and braces, if a closing bracket or brace isn't preceded by its corresponding opening character, the regex engine interprets it literally. In our example, we escaped '\[', so the ']' isn't preceded by any opening '['. Thus, ']' will be treated as a literal ']' character.

3.2. Using NOR Character Classes

We've solved the problem by extracting "everything" between '[' and ']'. Here, "everything" consists of characters that aren't ']'. Regex supports NOR classes. For instance, "[^0-9]" matches any non-digit character. Therefore, we can elegantly address this issue by employing a regex NOR class, resulting in the pattern "\[([^]]*)":

String result = null;
String rePattern = "\\[([^]]*)";
Pattern p = Pattern.compile(rePattern);
Matcher m = p.matcher(INPUT1);
if (m.find()) {
    result = m.group(1);
}
assertThat(result).isEqualTo(EXPECTED1);

3.3. Using the split() Method

Java offers the powerful String.split() method to break the input string into pieces. split() supports a regex pattern as the delimiter. Next, let's see if our problem can be solved by the split() method. Consider the scenario of "prefix[value]suffix". If we designate '[' or ']' as the delimiter, split() would yield an array: {"prefix", "value", "suffix"}. The next step is relatively straightforward. We can simply take the middle element from the array as the result:

String[] strArray = INPUT1.split("[\\[\\]]");
String result = strArray.length == 3 ? strArray[1] : null;
assertThat(result).isEqualTo(EXPECTED1);

In the code above, we ensure the split result has exactly three elements before taking the second element out of the array. The test passes when we run it. However, this solution may fail if the input ends with ']':

String[] strArray = "[THE IMPORTANT MESSAGE]".split("[\\[\\]]");
assertThat(strArray).hasSize(2)
  .containsExactly("", "THE IMPORTANT MESSAGE");

As the test above shows, our input doesn't have a "prefix" and "suffix" this time. By default, split() discards trailing empty strings. To solve it, we can pass a negative limit to split(), to tell split() to keep the empty string elements:

strArray = "[THE IMPORTANT MESSAGE]".split("[\\[\\]]", -1);
assertThat(strArray).hasSize(3)
  .containsExactly("", "THE IMPORTANT MESSAGE", "");

Therefore, we can change our solution to cover the corner case:

String[] strArray = INPUT1.split("[\\[\\]]", -1);
String result = strArray.length == 3 ? strArray[1] : null;
...

4. Input With Multiple Square Brackets Pairs

After solving the single "[..]" pair case, extending the solutions to work with multiple "[..]" cases won't be a challenge for us. Let's take a new input example:

final String INPUT2 = "[La La Land], [The last Emperor], and [Life of Pi] are all great movies.";

Next, let's extract the three movie titles from it:

final List<String> EXPECTED2 = Lists.newArrayList("La La Land", "The last Emperor", "Life of Pi");

4.1. The [(.*)] Idea – Non-Greedy Version

The pattern "\[(.*)]" efficiently facilitates the extraction of the desired content from a single "[..]" pair. But this won't work for inputs with multiple "[..]" pairs. This is because regex does greedy matching by default. In other words, if we match INPUT2 with "\[(.*)]", the capturing group will hold the text between the first '[' and the last ']': "La La Land], [The last Emperor], and [Life of Pi". However, we can add a '?' after '*' to make the regex do a non-greedy match. Additionally, as we'll extract multiple target values, let's change if (m.find()) to a while loop:

List<String> result = new ArrayList<>();
String rePattern = "\\[(.*?)]";
Pattern p = Pattern.compile(rePattern);
Matcher m = p.matcher(INPUT2);
while (m.find()) {
    result.add(m.group(1));
}
assertThat(result).isEqualTo(EXPECTED2);

4.2. Using Character Classes

The NOR character class solution works for inputs with multiple "[..]" pairs too. We only need to change the if statement to a while loop:

List<String> result = new ArrayList<>();
String rePattern = "\\[([^]]*)";
Pattern p = Pattern.compile(rePattern);
Matcher m = p.matcher(INPUT2);
while (m.find()) {
    result.add(m.group(1));
}
assertThat(result).isEqualTo(EXPECTED2);

4.3. Using the split() Method

For inputs with multiple "[..]"s, if we split() by the same regex, the result array should have more than three elements. So, we can't simply take the middle (index=1) one:

Input: "---[value1]---[value2]---[value3]---"
Array: "---", "value1", "---", "value2", "---", "value3", "---"
Index:  [0]    [1]      [2]    [3]      [4]    [5]      [6]

However, if we look at the indexes, we find that all elements with odd indexes are our target values. Therefore, we can write a loop to get the desired elements from split()'s result:

List<String> result = new ArrayList<>();
String[] strArray = INPUT2.split("[\\[\\]]", -1);
for (int i = 1; i < strArray.length; i += 2) {
    result.add(strArray[i]);
}
assertThat(result).isEqualTo(EXPECTED2);

5. Conclusion

In this article, we learned how to extract text between square brackets in Java. We learned different regex-related approaches to address the challenge, effectively tackling two problem scenarios. As always, the complete source code for the examples is available over on GitHub.

25
1
submitted 1 year ago by [email protected] to c/[email protected]
  1. Overview

Lombok is a Java library that helps to reduce boilerplate code like getters, setters, etc. The OpenAPI Generator provides a property to auto-generate a model with Lombok annotations. In this tutorial, we'll explore how to generate a model with Lombok annotations using the OpenAPI code generator.

2. Project Setup

To begin with, let's bootstrap a Spring Boot project and add the Spring Boot Starter Web and Lombok dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>3.1.2</version>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.28</version>
    <scope>provided</scope>
</dependency>

Additionally, we need the Swagger Annotations, Gson, and Java Annotation API dependencies to prevent errors related to packages in the generated code:

<dependency>
    <groupId>javax.annotation</groupId>
    <artifactId>javax.annotation-api</artifactId>
    <version>1.3.2</version>
</dependency>
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.10.1</version>
</dependency>
<dependency>
    <groupId>io.swagger</groupId>
    <artifactId>swagger-annotations</artifactId>
    <version>1.6.2</version>
</dependency>

In the next section, we'll create an API specification for a model named Book and later generate the code with Lombok annotation using the OpenAPI code generator.

3. Generating Model Using OpenAPI

The idea of OpenAPI is to write the API specification before actual coding begins. Here, we'll create a specification file and generate a model based on the specification.

3.1. Creating Model Specification

First, let's create a new file named bookapi.yml in the resources folder to define the Book specification:

openapi: 3.0.2
info:
  version: 1.0.0
  title: Book Store
  license:
    name: MIT
paths:
  /books:
    get:
      tags:
        - book
      summary: Get All Books
      responses:
        200:
          description: successful operation
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Book'
        404:
          description: Book not found
          content: { }
components:
  schemas:
    Book:
      type: object
      required:
        - id
        - name
        - author
      properties:
        id:
          type: integer
          format: int64
        name:
          type: string
        author:
          type: string

In the specification above, we define the Book schema with id, name, and author fields. Additionally, we define an endpoint to get all stored books.

3.2. Generate a Model With Lombok Annotation

After defining the API specification, let's add the OpenAPI plugin to the pom.xml to help generate the code based on the specification:

<plugin>
    <groupId>org.openapitools</groupId>
    <artifactId>openapi-generator-maven-plugin</artifactId>
    <version>4.2.3</version>
    <executions>
        <execution>
            <goals>
                <goal>generate</goal>
            </goals>
            <configuration>
                <inputSpec>${project.basedir}/src/main/resources/bookapi.yml</inputSpec>
                <generatorName>java</generatorName>
                <configOptions>
                    <additionalModelTypeAnnotations>@lombok.Data @lombok.NoArgsConstructor @lombok.AllArgsConstructor</additionalModelTypeAnnotations>
                </configOptions>
                <!-- The three boolean flags below correspond to the three "false" values in the original
                     configuration, which disable supporting files and documentation generation; the exact
                     element names were not preserved in the source, so these are plausible options. -->
                <generateSupportingFiles>false</generateSupportingFiles>
                <generateApiDocumentation>false</generateApiDocumentation>
                <generateModelDocumentation>false</generateModelDocumentation>
            </configuration>
        </execution>
    </executions>
</plugin>

Here, we specify the location of the specification file for the plugin to check during the generation process. Also, we add the additionalModelTypeAnnotations property to add three Lombok annotations to the model. For simplicity, we disable the generation of supporting files and API documentation. Finally, let's generate the model by executing the Maven install command:

$ ./mvnw install

The command above generates a Book model in the target folder.

3.3. Generated Code

Let's see the generated Book model:

@lombok.Data
@lombok.NoArgsConstructor
@lombok.AllArgsConstructor
@javax.annotation.Generated(value = "org.openapitools.codegen.languages.JavaClientCodegen", date = "2023-08-16T09:16:34.322697262Z[GMT]")
public class Book {
    public static final String SERIALIZED_NAME_ID = "id";
    @SerializedName(SERIALIZED_NAME_ID)
    private Long id;

public static final String SERIALIZED_NAME_NAME = "name";
@SerializedName(SERIALIZED_NAME_NAME)
private String name;

public static final String SERIALIZED_NAME_AUTHOR = "author";
@SerializedName(SERIALIZED_NAME_AUTHOR)
private String author;
// ... 

}

In the generated code above, the three Lombok annotations we defined in the plugin using the additionalModelTypeAnnotations property are added to the model class. The @Data annotation helps generate the getters, setters, etc., at compile time. The @NoArgsConstructor annotation generates an empty constructor, and the @AllArgsConstructor annotation generates a constructor that takes an argument for every field in the class.
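To make the effect of those annotations concrete, here is a small, hypothetical usage snippet (not part of the original article; the book title and author values are made up) that relies only on the Lombok-generated constructors and accessors of the generated Book class:

public class BookDemo {
    public static void main(String[] args) {
        // @AllArgsConstructor: constructor taking all fields (id, name, author)
        Book book = new Book(1L, "The Hobbit", "J.R.R. Tolkien");

        // @Data: getters, setters, equals/hashCode, and toString are generated by Lombok
        book.setName("The Hobbit, Revised Edition");
        System.out.println(book.getName() + " by " + book.getAuthor());
        System.out.println(book);

        // @NoArgsConstructor: an empty constructor is also available
        Book empty = new Book();
        System.out.println(empty.getId()); // null
    }
}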

4. Conclusion

In this article, we learned how to generate a model with Lombok annotations using the OpenAPI code generator. Adding the additionalModelTypeAnnotations property gives us the flexibility to add the desired Lombok annotations. As always, the complete source code for the examples is available over on GitHub.

