
Tim Spann

DZone Core

Principal Developer Advocate at Cloudera

Hightstown, US

Joined Jun 2008

https://www.datainmotion.dev/

About

Tim Spann is a Principal Developer Advocate in Data In Motion for Cloudera. He works with Apache NiFi, Apache Pulsar, Apache Kafka, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, Delta Lake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal, and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit, and many more. He holds a BS and MS in computer science.

Stats

Reputation: 16466
Pageviews: 3.5M
Articles: 63
Comments: 31

Expertise

AI/ML

IoT

Articles

Mixtral: Generative Sparse Mixture of Experts in DataFlows
Explore the use of a new type of GenAI LLM with streaming pipelines in this tutorial about how to build a real-time LLM flow with Mistral AI's new open model, Mixtral.
March 13, 2024 · 1,792 Views · 2 Likes

Building a Generative AI Processor in Python
Why not create a Python Processor for Apache NiFi 2.0.0? In this tutorial, discover whether the challenge to do so is easy or difficult.
January 23, 2024 · 3,010 Views · 5 Likes

Building a Real-Time Slackbot With Generative AI
Learn how to build a cool Slackbot with Apache NiFi, LLM, Foundation Models, and streaming. We will cover model choices and integration.
November 29, 2023 · 2,541 Views · 5 Likes

Real-Time Analytics: All Data, Any Data, Any Scale, at Any Time
All data, any data, any scale, at any time: Learn why data pipelines need to embrace real-time data streams to harness the value of data as it is created.
October 3, 2023 · 5,757 Views · 6 Likes

What Is a Modern Developer? In Today’s World, It’s a Citizen Engineer
The modern developer is an overarching term covering a large variety of different roles, responsibilities, and skills. In today’s world, it’s a citizen engineer.
July 20, 2023 · 11,498 Views · 7 Likes

Streaming Change Data Capture Data Two Ways
Walk through how to use Debezium with Flink, Kafka, and NiFi for Change Data Capture using two different mechanisms: Kafka Connect and Flink SQL.
July 3, 2023 · 5,291 Views · 2 Likes

Harnessing the Power of NiFi: Building a Seamless Flow To Ingest PM2.5 Data From a MiNiFi Java Agent With a Particle Sensor
In this tutorial, discover how to use MiNiFi, NiFi, Kafka, and Flink for sensor ingest, processing, analytics, and visualization.
June 13, 2023 · 5,305 Views · 4 Likes

Real-Time Stream Processing With Hazelcast and StreamNative
In this article, readers will learn about real-time stream processing with Hazelcast and StreamNative, along with demonstrations and code.
Updated January 31, 2023 · 11,947 Views · 6 Likes

How to Create a Real-Time Scalable Streaming App Using Apache NiFi, Apache Pulsar, and Apache Flink SQL
In this article, we'll cover how and when to use Pulsar with NiFi and Flink as you build your streaming application.
January 22, 2023 · 6,837 Views · 7 Likes

Building Real-Time Weather Dashboards With Apache Pinot
Let's build a real-time weather dashboard application with Apache Pinot and Apache Pulsar.
December 11, 2022 · 6,441 Views · 4 Likes

Real-Time Pulsar and Python Apps on a Pi
Build a Python application on a Raspberry Pi that streams sensor data and more from the edge to any and all data stores while processing data in event time.
April 5, 2022 · 8,657 Views · 7 Likes

Deploying AI With an Event-Driven Platform
Explore the advantages of an event-driven platform for model deployment and make your model classification results more accessible.
March 14, 2022 · 7,519 Views · 3 Likes

Generating Simulated Streaming Data
In this article, learn more about using the Python library Faker to build synthetic data for tests and using Pulsar to send messages to topics at scale.
March 6, 2022 · 6,901 Views · 5 Likes

Pulsar in Python on Pi for Sensors
Utilizing Apache Pulsar's Python Client on Raspberry Pi - FLiP-Py Stack
February 27, 2022 · 14,187 Views · 4 Likes

Real-Time Edge Application With Apache Pulsar
In this article, you will learn how to build edge applications using Pulsar, the challenges of developing edge applications, and why Apache Pulsar is the solution.
December 18, 2021 · 26,384 Views · 4 Likes

Introducing Cloudera SQL Stream Builder (SSB)
SSB is an improved release of Eventador's SQL Stream Builder with integration into Cloudera Manager, Cloudera Flink, and other streaming tools.
Updated June 6, 2021 · 13,995 Views · 5 Likes

Modern Apache NiFi Load Balancing
In this article, we discuss the newest ways to perform load balancing in Apache NiFi (version 1.8.0+) that now make Remote Process Groups obsolete.
December 19, 2019 · 23,630 Views · 3 Likes

Exploring Apache NiFi 1.10: Parameters and Stateless Engine
In this article, we discuss the new version of Apache NiFi and how to use two of the biggest new features: parameters and stateless.
November 26, 2019 · 28,411 Views · 4 Likes

Migrating Apache Flume Flows to Apache NiFi: Kafka Source to Multiple Sinks
How to move off legacy Flume and onto modern Apache NiFi for data pipelines.
October 15, 2019 · 18,097 Views · 9 Likes

Real-Time Transit Feed Data Processing With Apache NiFi
Ingesting and processing real-time transit feeds at scale with Apache NiFi.
August 26, 2019 · 13,246 Views · 4 Likes

Creating Apache Kafka Topics Dynamically as Part of a DataFlow
Creating Kafka topics programmatically as part of streaming.
August 13, 2019 · 25,370 Views · 5 Likes

Edge Data Processing With Jetson Nano
Learn more about edge data processing with Jetson Nano.
July 24, 2019 · 13,184 Views · 5 Likes

Advanced XML Processing With Apache NiFi 1.9.1
In this post, we'll be using Apache NiFi to simply process very complex XML and RSS data files.
April 2, 2019 · 19,014 Views · 7 Likes

HDP 3.1 Released! All The Kafka!
A major upgrade to the Hadoop distribution has been released. Read on to learn how to upgrade to it.
December 18, 2018 · 16,759 Views · 7 Likes

Real-Time Stock Processing With Apache NiFi and Apache Kafka, Part 1
A big data expert starts his series on using Kafka and NiFi for real-time data flow programming.
November 20, 2018 · 42,147 Views · 15 Likes

Simple Apache NiFi Operations Dashboard (Part 2): Spring Boot
In this post, we continue building a dashboard with the open source big data platform Apache NiFi, using Spring Boot 2.0.6.
October 24, 2018 · 13,769 Views · 9 Likes

Building a Custom Apache NiFi Operations Dashboard (Part 1)
Using NiFi and Spring Boot for operations to create a custom dashboard for the data you use in your Apache NiFi application.
Updated October 23, 2018 · 19,630 Views · 14 Likes

Properties File Lookup Augmentation of Data Flow in Apache NiFi 1.7.x
In this article, a big data expert goes over reading from properties files to use with Apache NiFi flows. Read on to get started!
October 9, 2018 · 6,236 Views · 2 Likes

DevOps for Apache NiFi 1.7 and More
Learn how to better utilize command line tools, Python, and REST APIs for DevOps practices for Apache NiFi and NiFi Registry.
July 30, 2018 · 15,114 Views · 3 Likes

Convert JSON Data Files to Table DDL
In this post, we quickly introduce a new, open source processor for creating table definitions from JSON data files. Read on for more!
March 20, 2018 · 13,094 Views · 6 Likes

Refcards

Refcard #263

Messaging and Data Infrastructure for IoT

Refcard #204

Apache Spark

Refcard #251

Introduction to TensorFlow

Trend Reports

Trend Report

Data Pipelines

Enter the modern data stack: a technology stack designed and equipped with cutting-edge tools and services to ingest, store, and process data. No longer are we using data only to drive business decisions; we are entering a new era where cloud-based systems and tools are at the heart of data processing and analytics. Data-centric tools and techniques, like warehouses and lakes, ETL/ELT, observability, and real-time analytics, are democratizing the data we collect. The proliferation of and growing emphasis on data democratization result in increased and nuanced ways in which data platforms can be used. And of course, by extension, they also empower users to make data-driven decisions with confidence. In our 2023 Data Pipelines Trend Report, we further explore these shifts and improved capabilities, featuring findings from DZone-original research and expert articles written by practitioners from the DZone Community. Our contributors cover hand-picked topics like data-driven design and architecture, data observability, and data integration models and techniques.

Trend Report

Development at Scale

As organizations’ needs and requirements evolve, it’s critical for development to meet these demands at scale. The various realms in which mobile, web, and low-code applications are built continue to fluctuate. This Trend Report will further explore these development trends and how they relate to scalability within organizations, highlighting application challenges, code, and more.

Trend Report

Enterprise AI

In recent years, artificial intelligence has become less of a buzzword and more of an adopted process across the enterprise. With that, there is a growing need to increase operational efficiency as customer demands arise. AI platforms have become increasingly sophisticated, and there is now a need to establish guidelines and ownership. In DZone's 2022 Enterprise AI Trend Report, we explore MLOps, explainability, and how to select the best AI platform for your business. We also share a tutorial on how to create a machine learning service using Spring Boot, and how to deploy AI with an event-driven platform. The goal of this Trend Report is to better inform the developer audience on practical tools and design paradigms, new technologies, and the overall operational impact of AI within the business. This is a technology space that's constantly shifting and evolving. As part of our December 2022 re-launch, we've added new articles pertaining to knowledge graphs, a solutions directory for popular AI tools, and more.

Trend Report

Machine Learning

Industry leaders discuss the latest trends in machine learning. We dive into using machine learning with microservices, deploying machine learning models in real-life applications, and where the field is going over the next 12 months.

Events

Data Pipelines: Investigating the Modern Data Stack

Presenter: Decodable & Informatica

Comments

Kafka Connectors Without Kafka

Jun 26, 2023 · Jordan Baker

Any updates to this since KRaft?

Building Real-Time Weather Dashboards With Apache Pinot

Dec 12, 2022 · Tim Spann

https://github.com/tspannhw/pulsar-thermal-pinot/blob/main/weather.md

DevOps for Apache NiFi 1.7 and More

Jun 21, 2019 · Tim Spann

Put them into 2 Docker nodes. Are you using the image at https://hub.docker.com/r/apache/nifi-registry and is the configuration right? It has to store data. See https://nifi.apache.org/docs/nifi-registry-docs/html/getting-started.html


DevOps for Apache NiFi 1.7 and More

Jun 19, 2019 · Tim Spann

Upgrade to 1.9. Is it Kerberized, or does it have a login?


Real-Time Stock Processing With Apache NiFi and Apache Kafka, Part 1

Nov 27, 2018 · Tim Spann

See https://community.hortonworks.com/articles/227560/real-time-stock-processing-with-apache-nifi-and-ap.html

Source: https://community.hortonworks.com/storage/attachments/93299-stock-to-kafka.xml https://community.hortonworks.com/storage/attachments/93298-stocks-copy.json https://github.com/tspannhw/stocks-nifi-kafka

Install Java 8

Install NiFi https://nifi.apache.org/download.html

Install Kafka https://kafka.apache.org/downloads

Or get a linux box or big VM

https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/index.html

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/index.html

How to Automatically Migrate All Tables From a Database to Hadoop With No Coding

Apr 09, 2018 · Tim Spann

We don't have one to list all topics. You could call kafka-topics.sh to get the list or make an API call.
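
A minimal sketch of the first option, assuming the output of `kafka-topics.sh --list` (one topic name per line) has already been captured as a string. `TopicListFormatter` and `toCommaList` are hypothetical names, not part of NiFi or Kafka, and skipping names that start with `__` is an assumption made here to leave out internal topics. The result is a comma-separated list of the kind a consumer's topic property can take.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of NiFi or Kafka.
final class TopicListFormatter {

    private TopicListFormatter() {
    }

    // Converts captured `kafka-topics.sh --list` output
    // (assumed format: one topic name per line) into a comma-separated list.
    static String toCommaList(String listOutput) {
        List<String> topics = new ArrayList<>();
        for (String line : listOutput.split("\\R")) {
            String topic = line.trim();
            // Assumption: skip blank lines and internal topics such as __consumer_offsets.
            if (topic.isEmpty() || topic.startsWith("__")) {
                continue;
            }
            topics.add(topic);
        }
        return String.join(",", topics);
    }

    public static void main(String[] args) {
        String sample = "stocks\n__consumer_offsets\nweather\n";
        System.out.println(toCommaList(sample)); // prints "stocks,weather"
    }
}
```

Keeping the parsing separate from the shell call means the same helper works whether the list comes from the script or from an API response flattened to one name per line.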

How to Automatically Migrate All Tables From a Database to Hadoop With No Coding

Apr 08, 2018 · Tim Spann

https://community.hortonworks.com/articles/57262/integrating-apache-nifi-and-apache-kafka.html


https://community.hortonworks.com/articles/155527/ingesting-golden-gate-records-from-apache-kafka-an.html


ConsumeKafkaRecord_1_0 (with a comma-separated list of all topics) to ConvertRecord to PutHDFS. You may add ConvertAvroToORC or PutParquet.


How to Automatically Migrate All Tables From a Database to Hadoop With No Coding

Apr 08, 2018 · Tim Spann

It has worked for me. Post here: https://community.hortonworks.com/gallery/index.html https://community.hortonworks.com/questions/1629/nifi-connection-to-mssql-server-db.html https://community.hortonworks.com/articles/87632/ingesting-sql-server-tables-into-hive-via-apache-n.html

Using Jolt in Big Data Streams to Remove Nulls

Mar 23, 2018 · Tim Spann

https://github.com/bazaarvoice/jolt/issues/130


I use default values. https://community.hortonworks.com/articles/149910/handling-hl7-records-part-1-hl7-ingest.html

Spring Boot 2.0 on ACID! Big Data + Spring Boot

Mar 14, 2018 · Tim Spann

good point, this was a follow up to the other article, should have had a review. sorry.

Spring Boot 2.0 on ACID! Big Data + Spring Boot

Mar 12, 2018 · Tim Spann

Josh Long is great, when I worked at Pivotal I got a few articles in the Spring Weekly list.

Spring Boot 2.0 on ACID! Big Data + Spring Boot

Mar 12, 2018 · Tim Spann

Spring for Hadoop hasn't been updated in forever. It is stuck on HDP 2.2 and we are on HDP 2.6. Spring Data JDBC and Spring Data Repositories make a lot of sense. I should do that, I'll do an update when I get the chance. Maybe add Java 9 and some other goodies. Thanks for the suggestions. If you want to fork the github repo, please do!

Apache Tika and Apache OpenNLP for Easy PDF Parsing and Munching

Feb 08, 2018 · Tim Spann

http://opennlp.sourceforge.net/models-1.5/

Apache Tika and Apache OpenNLP for Easy PDF Parsing and Munching

Feb 08, 2018 · Tim Spann

You need to install the OpenNLP models and reference that in the processor properties. Also OpenNLP misses a lot of names and locations. Accuracy is kind of hit or miss. https://github.com/tspannhw/nifi-nlp-processor https://community.hortonworks.com/articles/163776/parsing-any-document-with-apache-nifi-15-with-apac.html

What Is Kafka? Everything You Need to Know

Aug 13, 2017 · Jean-Paul Azar

Jean-Paul,

There's another open source registry that integrates with Kafka and other systems extremely well and has a great REST API and UI:

https://github.com/hortonworks/registry

It has versioning and is moving to adding protocol buffers and going into Apache.

Have you tried that one?

How to Automatically Migrate All Tables From a Database to Hadoop With No Coding

Jul 06, 2017 · Tim Spann

NiFi can run on a cluster of servers to distribute the load. NiFi generally supports 50 megabytes a second per node
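
As a rough illustration of that figure, here is a minimal Java sketch that estimates how many nodes a target ingest rate would need. It assumes throughput scales linearly with node count, which real clusters only approximate. `NiFiClusterSizing` and `nodesNeeded` are hypothetical names, and the 50 MB/s constant is simply the rule of thumb quoted above, not a measured limit.

```java
// Hypothetical sizing helper: a back-of-the-envelope estimate only.
final class NiFiClusterSizing {

    // Rule-of-thumb throughput per node quoted in the comment above (MB/s).
    static final double MB_PER_SECOND_PER_NODE = 50.0;

    private NiFiClusterSizing() {
    }

    // Returns the smallest node count whose combined throughput meets the target,
    // assuming (optimistically) that throughput scales linearly with nodes.
    static int nodesNeeded(double targetMbPerSecond) {
        if (targetMbPerSecond <= 0) {
            throw new IllegalArgumentException("target throughput must be positive");
        }
        return (int) Math.ceil(targetMbPerSecond / MB_PER_SECOND_PER_NODE);
    }

    public static void main(String[] args) {
        System.out.println(nodesNeeded(120.0)); // prints 3
        System.out.println(nodesNeeded(50.0));  // prints 1
    }
}
```

Actual capacity depends on flow design, record size, disks, and network, so a measured per-node rate should replace the constant before relying on the estimate.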

16 Free and Open-Source Business Intelligence Tools

Apr 28, 2017 · Sarah Davis

Have you seen the open-source Superset from Airbnb and Hortonworks?

TensorFlow on the Edge, Part 1 of 5

Mar 16, 2017 · Tim Spann

Paused on that one; will try this weekend.



Yes, Java Has Flaws. But...

Apr 19, 2016 · Tim Spann

Some ways to do Java 8. https://dzone.com/articles/zlwell-written-java

Yes, Java Has Flaws. But...

Apr 15, 2016 · Tim Spann

I have seen a lot of messy code. You bring it into IntelliJ / Eclipse and format the code, hide the bad comments, and run some static code analysis tools. Java is nice for that.

Adding EL support on your projects

Apr 15, 2015 · Mr B Loid

This article was from 2013. Those are good options too. This should probably be upgraded.

Learn All the NetBeans IDE Refactorings!

Nov 03, 2014 · Geertjan Wielenga

Have you seen Spring Data REST ?

Variable Scoping with gcc

Oct 02, 2014 · Mr B Loid

What about Gemfire XD?

http://www.pivotal.io/big-data/pivotal-gemfire-xd

What about Gemfire?

http://www.pivotal.io/big-data/pivotal-gemfire


Log Scraping

Jun 05, 2013 · Eric Gregory

Good catch. I like your second idea better though.

Selenide in 15 seconds

Jun 03, 2013 · Tim Spann

Thanks! I thought I ran Organize Imports in Eclipse. I must not have.

Review of techniques of making navigation tabs with CSS - www.pagecolumn.com

Feb 02, 2013 · Ken Lee

Ours was dying on hotmail and gmail only. Turned out to be a mail server issue with one particular email address used for sending. Black list security issue.



/**
 * Sends a plain-text email (the calls match Apache Commons Email) and
 * returns a log of the attempt.
 *
 * @param subject subject line of the email
 * @param message body text of the email
 * @param emailAddress recipient address
 * @return log text describing the attempt, including any error messages
 */
public static final String sendEmail(String subject, String message, String emailAddress) {
    // set to true only after send() succeeds
    boolean emailSent = false;
    String newline = System.getProperty("line.separator");

    // log message returned to the caller
    StringBuilder emailLoggingMessage =
            new StringBuilder(UserProvisioningUtility.gkBUFFER_SIZE);

    // start of message
    emailLoggingMessage
            .append(Messages.getString("UserProvisioningProcess.SMTP_MESSAGE_LOG"))
            .append(emailAddress).append(newline);

    Email email = new SimpleEmail();
    //email.setHostName("smtp.googlemail.com");
    //email.setSmtpPort(465);
    email.setHostName(Messages.getString("UserProvisioningProcess.EMAIL_SERVER"));

    // default SMTP port, overridden when a valid port is configured
    int smtpPort = 25;
    String configuredPort = Messages.getString("UserProvisioningProcess.EMAIL_PORT");
    if (null != configuredPort) {
        try {
            smtpPort = Integer.parseInt(configuredPort);
        } catch (NumberFormatException e) {
            // keep the default port and record why the configured value was ignored
            emailLoggingMessage.append(e.getLocalizedMessage()).append(newline);
        }
    }
    email.setSmtpPort(smtpPort);

    // email.setAuthenticator(new DefaultAuthenticator("username", "password"));
    // email.setSSLOnConnect(true);
    try {
        email.setFrom(Messages.getString("UserProvisioningProcess.FROM_EMAIL"));
    } catch (EmailException e) {
        emailLoggingMessage.append(e.getLocalizedMessage()).append(newline);
    }
    email.setSubject(subject);
    try {
        email.setMsg(message);
    } catch (EmailException e) {
        emailLoggingMessage.append(e.getLocalizedMessage()).append(newline);
    }
    try {
        email.addTo(emailAddress);
    } catch (EmailException e) {
        emailLoggingMessage.append(e.getLocalizedMessage()).append(newline);
    }
    try {
        email.send();
        emailSent = true;
    } catch (EmailException e) {
        emailLoggingMessage.append(e.getLocalizedMessage()).append(newline);
    }

    // sent message
    if (emailSent) {
        emailLoggingMessage
                .append(Messages.getString("UserProvisioningProcess.SMTP_MESSAGE_SENT"))
                .append(newline);
    }

    return emailLoggingMessage.toString();
}

TOAD for Cloud Databases

Jul 31, 2012 · Eric Genesky

http://toadforcloud.com/entry.jspa?categoryID=677&externalID=4093 Get the community edition
