
Securing Web User Interfaces of Cloudera Data Platform (CDP) Services via Apache Httpd Reverse Proxy

How to use Apache HTTPD server as a reverse proxy to secure the web user interfaces of Cloudera Data Platform (CDP) services.

By Babul Bansal · Jan. 16, 24 · Tutorial

Apache HTTP Server (httpd) is an HTTP server built by the Apache Software Foundation. HTTP stands for Hypertext Transfer Protocol, the protocol used to transfer hypertext and multimedia documents between a client and a server. An HTTP daemon (a background process) runs on the server and serves requests from any HTTP client, such as a web browser. It is important to note that, on its own, Apache HTTP Server serves static content, such as text or media that doesn’t change while the web page loads. Dynamic content is served through technologies such as Common Gateway Interface (CGI) scripts, JavaServer Pages (JSP), etc.

What Are Apache Modules?

As stated above, Apache HTTP Server is a basic web server that serves non-dynamic content. By itself, the core server doesn’t provide functionality such as authentication, encryption of requests (SSL support), logging, heartbeat, LDAP, caching, etc. Instead, this functionality comes from modules, which are special programs that extend the core Apache HTTP Server.

What Are the Apache mod_ldap and mod_authnz_ldap Modules?

Lightweight Directory Access Protocol (LDAP) is a protocol for accessing a directory that stores principals (users, organizations, functional IDs, service accounts, etc.). The directory is hosted by a server that supports LDAP. mod_ldap is an Apache module that provides the core LDAP functionality (connection pooling and result caching) to the Apache server. mod_authnz_ldap builds on it and allows an LDAP directory to be used as the user database for HTTP Basic authentication, and for authorization, of these principals.
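
To make the role of basic authentication concrete, here is a minimal sketch of how a client encodes the credentials that mod_authnz_ldap later checks against the directory. The function name basic_auth_header and the user name and password are made-up values for illustration.

```shell
# Sketch: build the Authorization header that a client sends for HTTP Basic
# authentication. "alice" and "secret" are hypothetical example values.
basic_auth_header() {
    # The header value is the Base64 encoding of "user:password".
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header alice secret
```

This prints Authorization: Basic YWxpY2U6c2VjcmV0. Base64 is an encoding, not encryption, so anyone who can read the request can recover the password. That is why the Web UIs should only be exposed over SSL/TLS.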

What Is the Problem With Cloudera Data Platform Web Services?

Cloudera Data Platform (CDP) is a Big Data Platform that provides open source services like Hadoop, YARN, Spark, Impala, HUE, Hive, Kafka, NiFi, etc., for Data Warehousing, Real-Time Data Processing, Data Analytics, Machine Learning, Scalability, Security, and many other advanced features. Many of these services expose web user interfaces (Web UIs) that serve static content but are not protected by any authentication mechanism, which goes against overall security governance best practices.

How Apache Modules Can Help

The Apache LDAP modules add authentication to these Web UIs and ensure that only authorized individuals can access them. Below are some best practices for securing these Web UIs.

  • For any existing CDP service, identify the port or list of ports where the Web UI can be accessed.
  • Ensure that all these Web UIs are SSL/TLS enabled.
  • Add authentication and authorization using the Apache HTTP modules.
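
The first two points above can be checked from the command line. The following is a rough sketch that assumes openssl and timeout are installed; HOST and PORTS are placeholders to be replaced with your own values, and check_tls is a name chosen for this example.

```shell
# Sketch: report whether each Web UI port completes a TLS handshake.
# HOST and PORTS are placeholders; replace them with your own values.

# Succeeds if host:port completes a TLS handshake within 5 seconds.
check_tls() {
    timeout 5 openssl s_client -connect "$1:$2" </dev/null >/dev/null 2>&1
}

HOST="${HOST:-localhost}"
PORTS="${PORTS:-50075 50470}"

for port in $PORTS; do
    if check_tls "$HOST" "$port"; then
        echo "$HOST:$port TLS enabled"
    else
        echo "$HOST:$port no TLS, or port not reachable"
    fi
done
```

A failed handshake normally makes openssl exit with a non-zero status, which is what the function relies on. It does not validate the certificate itself, only that the port speaks TLS.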

Demonstration With Code

Let us enable authentication for the Hadoop DataNode web user interface. In this setup, the DataNode UI works on port 50075, and when SSL/TLS is enabled, the port changes to 50470. Port numbers are configurable, so confirm the values used in your cluster.

This is done via a reverse proxy: install and configure Apache HTTP Server and the Apache LDAP modules mentioned above, and enable a software firewall (iptables).

Operating System: RedHat Enterprise Linux/CentOS 7.9

Cloudera Data Platform: v7.x 


Below are the steps to enable this.

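  1. Install the httpd server.

yum -y install httpd

  2. Install the mod_ldap package, which provides the LDAP modules required for authentication, and the mod_ssl package, which provides the SSL directives used in the configuration below.

yum -y install mod_ldap mod_ssl

  3. Install the iptables services package.

yum -y install iptables-services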

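  4. Create a configuration file in the directory /etc/httpd/conf.d/. Any name works as long as it ends in .conf, because the default httpd.conf only includes files with that extension from this directory; we are calling it datanode.conf. The contents of this file are shown below.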

#Port for Datanode Web User Interface
Listen 50471

SSLPassPhraseDialog exec:/usr/libexec/httpd-ssl-pass-dialog
SSLSessionCache         shmcb:/run/httpd/sslcache(512000)
SSLSessionCacheTimeout  300
SSLRandomSeed startup file:/dev/urandom  256
SSLRandomSeed connect builtin
SSLCryptoDevice builtin

ErrorLog logs/ssl_error_log
TransferLog logs/ssl_access_log
LogLevel debug

SSLProtocol all -SSLv2 -SSLv3
SSLCipherSuite HIGH:3DES:!aNULL:!MD5:!SEED:!IDEA
SSLCertificateFile <LOCATION OF CERTIFICATE>
SSLCertificateKeyFile <LOCATION OF CERTIFICATE'S PRIVATE KEY IN .pem FORMAT>
ServerSignature Off
ServerTokens Prod

<Directory "/var/www/cgi-bin">
    SSLOptions +StdEnvVars
</Directory>

CustomLog logs/ssl_request_log \
          "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

<AuthnProviderAlias ldap ad-ldap>
    AuthLDAPBindAuthoritative on
    AuthLDAPURL "ldaps://ldaps.<LDAP SERVER ADDRESS>:636/OU=<YOUR BUSINESS UNIT>,DC=<BUSINESS>,DC=<COUNTRY>?sAMAccountName?sub?(objectClass=*)"
    AuthLDAPBindDN "CN=<BINDING USER WHICH WILL AUTHENTICATE AGAINST AD>,OU=<BINDING USER'S OU>,DC=<BUSINESS>,DC=<COUNTRY>"
    AuthLDAPBindPassword "<BINDING USER'S PASSWORD>"
</AuthnProviderAlias>

<AuthzProviderAlias ldap-group ldap-group-<AD GROUP NAME> "CN=<AD GROUP NAME>,OU=<AD GROUP'S OU>,DC=<BUSINESS>,DC=<COUNTRY>">
    AuthLDAPURL "ldaps://ldaps.<LDAP SERVER ADDRESS>:636/OU=<YOUR BUSINESS UNIT>,DC=<BUSINESS>,DC=<COUNTRY>?sAMAccountName?sub?(objectClass=*)"
    AuthLDAPBindDN "CN=<BINDING USER WHICH WILL AUTHENTICATE AGAINST AD>,OU=<BINDING USER'S OU>,DC=<BUSINESS>,DC=<COUNTRY>"
    AuthLDAPBindPassword "<BINDING USER'S PASSWORD>"
    AuthLDAPGroupAttribute member
    AuthLDAPGroupAttributeIsDN on
    AuthLDAPMaxSubGroupDepth 0
</AuthzProviderAlias>

<VirtualHost *:50471>
SSLEngine on
SSLProxyEngine on
SSLProxyCheckPeerCN off
SSLProxyCheckPeerExpire off
RewriteEngine on
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK) 
RewriteRule .* - [F]

<Location "/">
    LDAPReferrals off
    AuthType Basic
    AuthName "<YOUR MESSAGE TO BE DISPLAYED WHEN USER OPENS THE UI>"
    AuthBasicProvider ad-ldap

    <RequireAny>
         Require ldap-group-<AD GROUP NAME>
    </RequireAny>

    ProxyPreserveHost On
    ProxyPass https://<DATANODE IP ADDRESS OR HOSTNAME>:50470/
    ProxyPassReverse https://<DATANODE IP ADDRESS OR HOSTNAME>:50470/
</Location>
</VirtualHost>


  5. Create the iptables rules. On RHEL/CentOS 7, the iptables service loads its rules from the file /etc/sysconfig/iptables, so the contents below need to be placed in that file.

# Configuration for iptables service
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# Accept traffic to port 50470 only from the allowed subnet
-A INPUT -s <ADD YOUR SUBNET> -p tcp -m tcp --dport 50470 -j ACCEPT
# Drop all other traffic to port 50470
-A INPUT -p tcp -m tcp --dport 50470 -j DROP
COMMIT
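
iptables evaluates the rules of a chain in order and applies the first one that matches, so the ACCEPT rule for the subnet has to come before the DROP rule. The sketch below checks that ordering for rules printed in iptables -S format. The function name accept_before_drop and the sample subnet 10.0.0.0/24 are assumptions made for this example.

```shell
# Sketch: read rules in `iptables -S` format on stdin and succeed only if an
# ACCEPT rule for the given port appears before the first DROP rule for it.
accept_before_drop() {
    awk -v port="$1" '
        $0 ~ ("--dport " port " ") && /-j ACCEPT/ { if (!drop) accept = 1 }
        $0 ~ ("--dport " port " ") && /-j DROP/   { drop = 1 }
        END { exit (accept ? 0 : 1) }
    '
}

# Example with the two rules from the file above (10.0.0.0/24 is a made-up subnet).
printf '%s\n' \
    '-A INPUT -s 10.0.0.0/24 -p tcp -m tcp --dport 50470 -j ACCEPT' \
    '-A INPUT -p tcp -m tcp --dport 50470 -j DROP' |
    accept_before_drop 50470 && echo "rule order OK"
```

On a live host, the same check can be run as root against the loaded rules with iptables -S INPUT | accept_before_drop 50470.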


  6. Load the LDAP modules into httpd. Create a file named 01-ldap.conf in the directory /etc/httpd/conf.modules.d/ with the contents below. The mod_ldap package may already have created this file; in that case, confirm that it contains these lines.

LoadModule authnz_ldap_module modules/mod_authnz_ldap.so
LoadModule ldap_module modules/mod_ldap.so
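
Before restarting anything, it can help to confirm that the configuration parses and that the modules the proxy relies on are loaded. This is a small sketch using the httpd -t and httpd -M options; the function name validate_httpd_config is chosen for this example.

```shell
# Sketch: check the httpd configuration before restarting the service.
validate_httpd_config() {
    if ! command -v httpd >/dev/null 2>&1; then
        echo "httpd not found in PATH" >&2
        return 2
    fi
    # -t parses the configuration and reports "Syntax OK" or the first error.
    httpd -t || return 1
    # -M lists the loaded modules; print the ones this setup depends on.
    httpd -M 2>/dev/null | grep -E 'ssl_module|ldap_module|proxy_module|proxy_http_module'
}
```

Run validate_httpd_config as root on the proxy host. The ldap_module pattern also matches authnz_ldap_module, so both LDAP modules should appear in the output.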


  7. After making the above changes, restart the iptables and httpd services with the commands below.

systemctl restart iptables
systemctl restart httpd

Once the services are restarted, the DataNode Web UI is available through the proxy on port 50471, and opening it requires authentication with your LDAP user ID and password.
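
One way to confirm this from the command line is to request the proxy port with and without credentials. The sketch below assumes curl is installed; http_status and describe_status are names chosen for this example, and proxy.example.com is a hypothetical host name.

```shell
# Sketch: confirm that the proxy port asks for credentials.

# Prints the HTTP status code for a request ("000" if the connection fails).
# -k skips certificate verification; remove it if the certificate is trusted.
http_status() {
    curl -k -s -o /dev/null -w '%{http_code}' "$@"
}

# Translates a status code into a short verdict.
describe_status() {
    case "$1" in
        401)     echo "authentication required" ;;
        2??|3??) echo "access granted" ;;
        *)       echo "unexpected status $1" ;;
    esac
}

# Usage (proxy.example.com is a hypothetical host name):
#   describe_status "$(http_status https://proxy.example.com:50471/)"
#   describe_status "$(http_status -u "<LDAP USER ID>" https://proxy.example.com:50471/)"
```

The first call should report that authentication is required. The second makes curl prompt for the password and should report that access is granted for a user in the configured AD group.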

Limitations of This Security Mechanism

This Apache HTTPD reverse proxy implementation doesn’t support Kerberos authentication. Kerberos is a network authentication protocol that allows the users and services of the platform to authenticate each other. So, if Kerberos authentication is enabled for the Web UIs, the above implementation will not work.
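
To find out whether a Web UI falls under this limitation, look at the response headers of the backend port: a Kerberos-enabled UI typically answers unauthenticated requests with a WWW-Authenticate: Negotiate header. The following sketch checks for that header; the function name requires_spnego is chosen for this example, and the response shown is a canned sample, not output from a real cluster.

```shell
# Sketch: read HTTP response headers on stdin and succeed if the server
# asks for SPNEGO (Kerberos) authentication.
requires_spnego() {
    grep -qi '^WWW-Authenticate:[[:space:]]*Negotiate'
}

# Example with a canned response, as a Kerberos-enabled Web UI might return it.
printf 'HTTP/1.1 401 Authentication required\r\nWWW-Authenticate: Negotiate\r\n' |
    requires_spnego && echo "Kerberos (SPNEGO) is enabled"
```

Against a real service, the headers can be fetched with curl -k -s -I https://<DATANODE IP ADDRESS OR HOSTNAME>:50470/ and piped into requires_spnego.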
