Rubens A. Zimbres
Contact
Lecture at Google Sao Paulo - Brazil
Machine Learning Engineer at Zendata
Dual Master and Doctor in Business Administration and Electrical Engineering
Certified Google Cloud Professional Data Engineer
Google Developer Expert in AI/Machine Learning (NLP) and Google Cloud Platform (Security)
Mentor of Google for Startups Accelerator Brazil
CompTIA Security+ certified
AWS Certified Cloud Practitioner, AWS Certified AI Practitioner and AWS Certified Machine Learning - Specialty
Links
⚠ Disclaimer: five elements of this website were developed with Generative AI ⚠
My collection of pins from the Google Developer Experts program, Google Cloud Champions Innovators and NEXT '24. They take a certain level of difficulty and effort to earn, which makes them special to me.
 Google facilities in Sunnyvale, California - North America Connect '24
👽 On April 9th I was at Google Cloud Next '24 in Las Vegas (Champions and Certified Lounge), presenting LangChain and Gemini deployed on Google Cloud infrastructure (Dialogflow and Cloud Run) 👽
Google Cloud Champions Innovators at Next'24 Las Vegas
APPS I DEVELOPED RECENTLY
APP 01 : MULTIMODAL GENERATIVE AI APP 📽
MULTIMODAL GENERATIVE AI APP
I developed this Generative AI app with Streamlit and deployed it on Google Cloud infrastructure, running in a Cloud Run container. The Python code, requirements and Dockerfile were used to build an image with Docker and push it to Artifact Registry, from which it was deployed to Cloud Run.
The app runs a Gemini-1.5-Flash model and showcases its multimodal features: you can generate a report via function calling and analyze a PDF file with math calculations, a price table in an image, audio from Apple's Q2 2024 earnings report, and a marketing video. All this data then serves as input to Gemini 1.5 Flash for an overall analysis of the financial and marketing strategies. This helps decision makers make better decisions.
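A minimal sketch of the function-calling idea, under stated assumptions: the model returns a structured request naming a function and its arguments, and the app runs the matching local Python function. The `generate_report` tool, the `TOOLS` registry and the `{"name": ..., "args": ...}` dictionary shape are hypothetical illustrations, not the app's actual code or the Gemini SDK's object format.

```python
from typing import Any, Callable, Dict


def generate_report(company: str, quarter: str) -> str:
    """Hypothetical tool: build a short report title from its arguments."""
    return f"Report for {company}, {quarter}"


# Hypothetical registry mapping tool names to local Python functions.
TOOLS: Dict[str, Callable[..., Any]] = {"generate_report": generate_report}


def dispatch(function_call: Dict[str, Any]) -> Any:
    """Run the local function named in a model-issued function call.

    `function_call` is assumed to look like {"name": "...", "args": {...}};
    real SDK response objects are structured differently.
    """
    name = function_call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**function_call.get("args", {}))
```

In a real app, the value returned by `dispatch` would be sent back to the model so it can write the final answer.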
The app is fully customizable via code and gets its image and video data from Google Cloud Storage buckets.
The infrastructure is cost-efficient, as it runs on machines with only 1024 MB of memory, and Gemini-1.5-Flash costs 1/10 as much as Gemini-1.0-Pro. Safety settings were added to block harmful content.
 APP 02 :  IN-BROWSER LLM INFERENCE
IN-BROWSER LLM INFERENCE
I developed this app for sentiment analysis using tflite and JavaScript. A BERT model is converted into a .tflite model, and the JS code loads BERT.tflite in the browser from a remote repo.
In-browser inference means running machine learning models directly within a web browser, eliminating the need to send data back and forth to a server, as App 01 does.
One can train any customized model in TensorFlow/PyTorch, convert it to .tflite and deploy it in the web browser.
 APP 03 :   REAL-TIME IN-BROWSER INFERENCE
The article explaining how to use a customized Tensorflow Lite model for in-browser inference is here.
IN-BROWSER ML INFERENCE
I developed this app for image classification using tflite and JavaScript. Instead of the default, supported EfficientNet model, I used a PyTorch model. The JS code loads the quantized .tflite model in the browser from a remote repo.
This solution allows anyone to train any customized model in Tensorflow/PyTorch, convert to .tflite and deploy in the web browser.
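To illustrate what "quantized" means here, the sketch below shows 8-bit affine quantization, where a float is stored as an integer and recovered as `(q - zero_point) * scale`. The page does not say which scheme the model uses, so treating it as int8 affine quantization is an assumption, and the `scale` and `zero_point` values in the usage note are hypothetical.

```python
def quantize(value: float, scale: float, zero_point: int) -> int:
    """Map a float to int8 with affine quantization, clipping to [-128, 127]."""
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover an approximate float: real = (q - zero_point) * scale."""
    return (q - zero_point) * scale
```

For example, with a hypothetical `scale=0.1` and `zero_point=0`, `quantize(0.5, 0.1, 0)` stores the value as the integer 5, and values too large for the range are clipped to 127. Storing weights this way is what keeps the model file small enough to download into the browser.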
 APP 04 :   LLM POWERED CHATBOT
LLM POWERED CHATBOT
I developed this chatbot with Generative AI technology, with the help of Anthropic's Claude. The idea was to analyze Claude's capabilities, starting from scratch. Indeed, the chatbot interface took less than 15 minutes to get ready.
It is very simple: it is built with HTML, JavaScript and CSS, and the webpage calls an endpoint deployed on Google Cloud infrastructure, running in a Cloud Run container. The Python code, requirements and Dockerfile were used to build an image with Docker and push it to Artifact Registry, from which it was deployed to Cloud Run.
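A minimal sketch of what such an endpoint could look like, using only the Python standard library. It is not the app's actual code: the `call_model` function is a hypothetical stand-in for the Gemini call, the `{"message": ...}` request shape is an assumption, and authentication and CORS handling are left out. The one grounded detail is that a Cloud Run container is expected to listen on the port given in the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def call_model(message: str) -> str:
    """Hypothetical stand-in for the Gemini call; it only echoes the input."""
    return f"(model reply to: {message})"


def build_reply(raw_body: bytes) -> dict:
    """Parse the JSON request body and build a JSON-serializable response."""
    payload = json.loads(raw_body or b"{}")
    message = str(payload.get("message", "")).strip()
    if not message:
        return {"error": "empty message"}
    return {"reply": call_model(message)}


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        body = json.dumps(build_reply(self.rfile.read(length))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    """Start the server on the port Cloud Run passes in the PORT variable."""
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), ChatHandler).serve_forever()
```

Calling `serve()` as the container's entry point starts the server. The webpage would then send a POST request with a JSON body and display the `reply` field of the response.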
The app runs a Gemini-1.5-Flash model and is a generic chatbot. It was not fine-tuned, nor does it use additional context (RAG).
There are some improvements that can be made:
Add RAG (Retrieval Augmented Generation)
Add LangChain Agents in the container to perform specific tasks, such as querying databases and documents.
The infrastructure is cost-efficient, as it runs on machines with only 1024 MB of memory, and Gemini-1.5-Flash costs 1/20 as much as Gemini-1.0-Pro. Safety settings were added to block harmful content. The monthly fixed cost of the solution is 22 USD plus Gemini-1.5-Flash API calls, which are very cheap.
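The cost structure above (a fixed monthly amount plus pay-per-use API calls) can be sketched as simple arithmetic. The 22 USD fixed cost comes from the text; the number of calls and the price per call in the usage note are hypothetical placeholders, not real Gemini pricing.

```python
def monthly_cost_usd(fixed_usd: float, calls: int, usd_per_call: float) -> float:
    """Total monthly cost: fixed infrastructure cost plus per-call API charges."""
    return round(fixed_usd + calls * usd_per_call, 2)
```

For instance, with a hypothetical 10,000 calls per month at a hypothetical 0.0005 USD per call, `monthly_cost_usd(22.0, 10_000, 0.0005)` adds 5 USD of API usage to the 22 USD fixed cost.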
PROJECTS WE DELIVERED
ACTIVITY
My Google Developers profile
Articles about Google Cloud infrastructure, Vertex AI, Recommenders, Agent-Based Modeling, Graph Neural Networks, RAG. We have 17K views per month.
Articles about Google Cloud, Deep Learning, NLP, Agent-Based Modeling, Kubernetes, Transformers, Ethical Hacking, Pentesting, Social Networks
Python code from my journey as a Data Scientist. Machine Learning, Deep Learning, NLP, Transformers, Google Cloud, Algorithmic trading, TensorFlow, Keras and PyTorch
My profile at Google Experts Directory
My badges related to the courses and labs I completed
My previous experience, education, licenses, certifications, published papers and patents
My questions and answers on Stack Overflow. I am a recognized member of the Google Cloud Collective
My certifications in Machine Learning, Deep Learning, Google Cloud and Ethical Hacking
Academic papers published during my Master's and Doctorate
My three patents, two in NLP and one in Cellular Automata
I am a featured contributor at Wolfram Community, with 2 Staff Picks and 40,000 article views
MY ARTICLES
A critique of a flaw in Kolmogorov-Arnold Networks, which seem to fit any random data.
Here, I used a customized Resnet50 in PyTorch to develop an in-browser inference with Tensorflow Lite and Javascript.
In this article I provide technical details of Gemini-1.5-Flash, benchmarks comparison and use cases exploring multimodality (text, audio, video and images)
Article about using neo4j-runway to turn CSVs into Knowledge Graphs and then elaborate a diagnostic hypothesis, using Gemini and no Cypher.
Tutorial about fine-tuning Google’s open model Gemma-2b via HuggingFace and PyTorch to solve mathematical problems
Here, I host data and embeddings in a SQL instance, and the Generative AI application runs on GKE. Similarity results are returned by a SQL query inside the chatbot app.
Here, I explore the phases, methodologies, and potential impacts of a Transfer Learning attack.
Tutorial about building a Knowledge Graph from scratch using Neo4j and Cypher
Article about deploying Gemini on Google Kubernetes Engine (GKE) and Dialogflow.
In this article, I use LangChain and GPT-4 to evaluate Google’s open model Gemma-2B-it on 22 criteria.
I made Google's open source model Gemma collaborate with OpenAI’s gpt-3.5 to generate a graph plot from a simple natural language sentence at a low cost.
Here I create a Knowledge Graph storing scraped data in a structured manner, and using this data with LangChain to create a chatbot with memory
Article on how to use pgvector on a PostgreSQL database for RAG (Retrieval Augmented Generation) with LLMs.
A tutorial on how to identify the presence of Pegasus spyware on an iPhone
A Jupyter notebook about Two Towers Recommender in Tensorflow for Recruiting
Generative AI article on how to generate Python code using RAG + LangChain + LLM
Tutorial about installing an IDS for personal protection
OWASP Top 10 for LLMs, published at Google Developer Experts blog
Q&A Generative AI application in Google Cloud, published at Google Cloud blog
Generative AI automation to “read” and organize pedagogical projects in clusters
Setup of NVIDIA Merlin and Tensorflow for Recommendation Models, published in Google Developer Experts
Google Cloud Contact Center Artificial Intelligence (CCAI): A Managerial View
Two Towers Recommender: A Custom Pipeline in Vertex AI Using Kubeflow, published in Google Developer Experts
Search of Brazilian Laws using Dialogflow CX chatbot engine and Vertex AI Matching Engine
Graph Neural Networks: the message passing algorithm, published in Google Developer Experts
Develop Secure End-to-End Machine Learning Solutions in Google Cloud
My article on Attacking Active Directory in a Windows Server network with Kali Linux
Agent-Based Modeling with Python, NetLogo and Arduino
Burn a physical security key using a nRF52840 Dongle from Nordic to securely access your Google / Google Cloud accounts
In this article I present the steps to create a Generative Adversarial Network.
In this post, I explain how to run an IoT project from the command line, using Ubuntu Core on a Raspberry Pi 3.
My first project with an IoT device and AWS IoT. It collects CPU temperature in real time, sends it to AWS IoT and makes it available for Machine Learning models and dashboards.
Based on people's contacts with each other, you can easily see the whole social network analysis.
My attempt to explain Schrodinger's Equation and CP Violation.
This is code I developed with Wolfram Mathematica while trying to solve an Atari game. An application to traffic is presented.
PRESENTATIONS
RECOMMENDED CONTENT
People of AI is a podcast by Gus Martins and Ashley Oldacre from Google, showcasing inspiring people with interesting stories in the world of Artificial Intelligence (AI) and its subset, Machine Learning (ML).
The podcast interviews leaders, practitioners, researchers and learners in the field of AI/ML and invites them to share their stories, what they are building, lessons learned along the way, and their excitement for the AI/ML industry.
For all the episodes, visit the People of AI page at: https://peopleofai.libsyn.com/
GALLERY
EXPERIMENTS
CONTACT