April 6, 2022 marks the new edition of the Data Cloud Summit, an annual event organized by Google Cloud. Its many sessions announce new features developed by the cloud computing subsidiary of the Mountain View firm or discuss themes around data management.
Updates, previews of several APIs and solution suites, partnerships and collaborations: a look back at the new features Google Cloud presented at this event.
With BigLake, Google Cloud wants to unify data lakes and data warehouses
The first new product announced by Google Cloud is the upcoming arrival of BigLake. This tool should allow companies to better manage their data lakes, a storage method used to hold and manipulate large amounts of data.
Data lakes are generally contrasted with data warehouses, which are more traditional and still used today by many organizations. A data warehouse can only accommodate structured data (predefined and formatted according to a precise structure) so that it can be processed as quickly as possible. A data lake, by contrast, can store data in raw form, to be modeled later according to the needs of the user. This is called schema-on-read.
However, poor management and storage of data in a data lake can lead to the creation of silos: sets of raw data that only part of the company (a business unit, a department, etc.) can access, while the rest of the company cannot.
The accumulation of silos can become a real obstacle for a company. Several entities can store similar data in the same data lake, creating two identical silos that take up space in the storage environment and generate unnecessary storage costs. Silos also hinder the organization of a company: each entity must first notice that it lacks access to certain data in the data lake, then start a process to obtain access and use that data, which is a real waste of time.
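The schema-on-read idea described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the records, field names and the helper `read_with_schema` are hypothetical and have nothing to do with BigLake's actual API. The raw records are stored as they arrive, and each consumer applies its own schema at read time.

```python
import json

# Hypothetical raw records as they might land in a data lake:
# nothing is validated or reshaped when they are written.
RAW_EVENTS = [
    '{"user": "alice", "amount": "12.50", "country": "FR"}',
    '{"user": "bob", "amount": "7", "extra_field": true}',
    '{"user": "carol"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a (cast, default) pair. Fields that are
    not in the schema are ignored; missing fields get the default value.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else default
        rows.append(row)
    return rows


# Two consumers model the same raw data according to their own needs.
BILLING_SCHEMA = {"user": (str, ""), "amount": (float, 0.0)}
GEO_SCHEMA = {"user": (str, ""), "country": (str, "unknown")}

billing_rows = read_with_schema(RAW_EVENTS, BILLING_SCHEMA)
geo_rows = read_with_schema(RAW_EVENTS, GEO_SCHEMA)
```

In a data warehouse, by contrast, the equivalent of `BILLING_SCHEMA` would be enforced when the data is written, and a record such as the third one would have to be rejected or completed before storage.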
With BigLake, Google Cloud offers an API that allows organizations to unify their data lakes and data warehouses, so that they can analyze and better manage their data without worrying about its format (raw or structured) or the storage method used. This solution will be integrated with BigQuery, Google Cloud’s flagship data warehouse service.
Spanner update: the addition of Spanner change streams
In addition to BigLake, Google Cloud also presented Spanner change streams. This update should lift some of the data management limits faced by users of Spanner, one of the data management and storage services of the Google subsidiary.
Companies will now be able to monitor changes in their databases in real time in order to adapt to them more quickly. They will be able to replicate the changes made in Spanner to BigQuery in order to run real-time analyses or to decide how to react when the data changes.
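The general principle of replicating changes from one store to another can be sketched in a few lines of Python. This is a simplified illustration, not the Spanner or BigQuery API: the `ChangeRecord` type, its fields and the `apply_changes` function are assumptions made for the example, and the replica is a plain dictionary.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeRecord:
    """Hypothetical change record; not the actual Spanner change stream format."""
    commit_order: int                      # position of the change in commit order
    op: str                                # "INSERT", "UPDATE" or "DELETE"
    key: str                               # primary key of the affected row
    new_values: Optional[Dict[str, Any]] = None


def apply_changes(replica: Dict[str, Dict[str, Any]],
                  changes: List[ChangeRecord]) -> Dict[str, Dict[str, Any]]:
    """Replay change records on a replica, in commit order."""
    for change in sorted(changes, key=lambda c: c.commit_order):
        if change.op == "INSERT":
            replica[change.key] = dict(change.new_values or {})
        elif change.op == "UPDATE":
            replica.setdefault(change.key, {}).update(change.new_values or {})
        elif change.op == "DELETE":
            replica.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return replica


# Changes may arrive out of order; they are replayed by commit order.
incoming = [
    ChangeRecord(3, "DELETE", "order-2"),
    ChangeRecord(1, "INSERT", "order-1", {"status": "new", "total": 40}),
    ChangeRecord(2, "INSERT", "order-2", {"status": "new", "total": 15}),
    ChangeRecord(4, "UPDATE", "order-1", {"status": "shipped"}),
]

analytics_replica = apply_changes({}, incoming)
```

After the replay, the analytics copy holds only `order-1`, with its status updated to `shipped`, which mirrors the state of the source table without querying it directly.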
Vertex AI Workbench Update: Cloud Computing to Help Develop Machine Learning Models
During Google Cloud Next 2021, the Mountain View firm announced that it would update Vertex AI Workbench in the coming months. This is now the case. The suite of solutions for building, training, and deploying machine learning models has been optimized to work best with BigQuery, Serverless Spark, and Dataproc.
According to Google Cloud, Vertex AI will allow artificial intelligence specialists to design machine learning models "five times faster than traditional notebooks". They will be able to regularly update their models using data stored in the cloud.
New features will also arrive with the addition of Vertex AI Model Registry. This tool provides a repository for adding, discovering, using and managing machine learning models, in particular so that AI model developers can more easily share their models with application developers who want to use them.
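To make the idea of a model repository concrete, here is a minimal in-memory sketch in Python. It does not use or reproduce the Vertex AI Model Registry API: the `ModelRegistry` class, its methods and the toy "models" (plain functions) are hypothetical and only illustrate the add, discover and use steps mentioned above.

```python
from typing import Callable, Dict, List, Optional


class ModelRegistry:
    """Hypothetical in-memory registry that keeps every version of each model."""

    def __init__(self) -> None:
        self._models: Dict[str, List[Callable]] = {}

    def register(self, name: str, model: Callable) -> int:
        """Add a model under `name` and return its 1-based version number."""
        versions = self._models.setdefault(name, [])
        versions.append(model)
        return len(versions)

    def list_models(self) -> List[str]:
        """Discover the names of all registered models."""
        return sorted(self._models)

    def get(self, name: str, version: Optional[int] = None) -> Callable:
        """Fetch a given version of a model, or the latest one by default."""
        versions = self._models[name]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


# A model developer publishes two versions of a toy scoring function.
registry = ModelRegistry()
v1 = registry.register("churn_score", lambda visits: 1.0 if visits == 0 else 0.0)
v2 = registry.register("churn_score", lambda visits: 1.0 / (1 + visits))

# An application developer discovers the model and uses the latest version.
available = registry.list_models()
latest_score = registry.get("churn_score")(3)
first_score = registry.get("churn_score", version=1)(3)
```

Keeping every version addressable lets an application pin a known version while model developers continue to publish new ones.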
Connected Sheets for Looker: using data to optimize decision-making
Looker is a business intelligence platform that allows users to draw on data to improve productivity or bring innovation to a company through effective decision-making.
Google Cloud announces the launch of Connected Sheets, a tool that makes it possible to access Looker data models in Data Studio (a solution for creating data-based dashboards and reports) or in Google Sheets (the Google spreadsheet). The goal according to Google Cloud is to ensure that "Looker users can more easily access data-driven insights to drive innovation and make data-driven decisions by unifying all the tools needed to do so".
The Data Cloud Alliance: a working group to facilitate access to data management for all
For Gerrit Kazmaier, vice president of data analytics and databases for Google Cloud, "data is the common foundation of all digital transformations." Along with multiple cloud providers and data-focused companies, Google Cloud is behind the Data Cloud Alliance initiative. It includes Confluent, Databricks (which announced last year that it would invest in low-code/no-code), Dataiku, Deloitte, Elastic, Fivetran, MongoDB, Neo4j, Redis and Starburst.
The primary objective of this group will be to solve, together, the modern challenges related to digital transformation by committing to make data management more accessible across a variety of platforms, systems and technologies (dedicated infrastructure, creation of APIs and data integration support). All members of this alliance will work together to reduce the complexity associated with data governance.
With the upcoming arrival of all these new features, Google Cloud is showing its ambitions by offering varied and ever more efficient solutions and updates, intended to improve the daily operations of businesses through the use of data and cloud computing.
In recent years, several organizations have partnered with Google Cloud to give the cloud and data a more prominent role in the transformation of their business. This was the case for Renault, which wanted to digitize its supply chain, as well as for Twitter, YouTube and Lydia. More recently, Japan, in its desire to create a single platform bringing together all government services, called on the cloud branch of Google.