Overview
Microsoft Azure is a cloud computing service created by Microsoft for building, testing, deploying, and managing applications and services through Microsoft-managed data centers. It provides software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) and supports many different programming languages, tools, and frameworks, including both Microsoft-specific and third-party software and systems.
Integrations
Azure offers a broad spectrum of services that generate observability data through various tools and platforms. Apica Ascent integrates these tools into a unified interface, allowing for streamlined monitoring and management.
See the sub-modules on this page for the Azure integrations enabled by Apica Ascent.
This guide takes you through forwarding logs from an Azure Databricks cluster to Apica Ascent. Before you proceed with this setup, ensure that you meet the following prerequisites.
A private VNet
An Azure Databricks cluster deployed in the private VNet
An Apica Ascent endpoint and ingest token
To configure your Azure Databricks cluster to forward logs to your Apica Ascent endpoint, do the following.
Navigate to the Compute section in your Azure Databricks workspace.
Click Create Cluster.
Choose your cluster size.
Click Advanced options > SSH and paste your SSH public key. If you do not already have a key pair, generate one with the following command.
ssh-keygen -t rsa -b 4096 -C "email-id"
Next, on the Azure portal, under Network security group, add port 2200 to the Inbound ports section for the machines that the Databricks cluster spun up.
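If you manage the network security group from the Azure CLI instead of the portal, an inbound rule like the following opens port 2200. This is only a sketch; the resource group and NSG names are placeholders for the ones in the Databricks-managed resource group.
az network nsg rule create \
  --resource-group <databricks-managed-resource-group> \
  --nsg-name <workers-nsg> \
  --name allow-ssh-2200 \
  --priority 300 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --destination-port-ranges 2200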
To install and configure Fluent Bit on your Databricks cluster, do the following.
Log into the machine using the following command.
ssh ubuntu@machine-ip -p 2200 -i <private_key_file_path>
Install Fluent Bit as per the version of Ubuntu OS running on the machine. For detailed installation instructions, refer to the Fluent Bit installation documentation.
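For example, on a supported Ubuntu release Fluent Bit can typically be installed with the project's convenience script. This is a sketch; the package and service may be named td-agent-bit or fluent-bit depending on the version you install, so verify against the official instructions for your OS release.
# install Fluent Bit using the official convenience script
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
# confirm the service was installed and is registered with systemd
systemctl status td-agent-bit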
Use the Fluent Bit configuration file shown at the end of this section.
In the Fluent Bit configuration file, substitute the following details based on your implementation.
<ascent endpoint>: your Apica Ascent instance endpoint
<TOKEN>: your Apica Ascent ingest token
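Before restarting Fluent Bit with the new configuration, you can optionally confirm that the endpoint and token are reachable from the machine. This is only a reachability and authentication sketch; the sample payload fields mirror the record_modifier fields used in the configuration below and are illustrative, not a documented ingest schema.
# send a single test record to the Ascent HTTP ingest endpoint
curl -sk -X POST "https://<ascent endpoint>/v1/json_batch" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '[{"message":"connectivity check","AppName":"driver-stdout","namespace":"Databrick-worker","cluster_id":"Linux"}]'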
Next, replace the existing configuration at /etc/td-agent-bit/td-agent-bit.conf with the modified file.
Finally, restart Fluent Bit by running the following command.
systemctl restart td-agent-bit
Now, when you log into your Apica Ascent UI, you should see the logs from your Azure Databricks cluster being ingested and available to view.
The complete Fluent Bit configuration file follows.
[SERVICE]
Flush 1
Parsers_File /etc/td-agent-bit/parsers.conf
Log_Level debug
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/driver/stdout*
Tag driver-stdout
Buffer_Max_Size 1MB
Ignore_Older 5m
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/driver/*.log
Tag driver-log4j
Buffer_Max_Size 1MB
Ignore_Older 5m
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/driver/stderr*
Tag driver-stderr
Buffer_Max_Size 1MB
Ignore_Older 5m
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/eventlog/*/*/eventlog
Tag eventlog
Buffer_Max_Size 1MB
Ignore_Older 5m
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/executor/*/*/stdout*
Tag executor-stdout
Buffer_Max_Size 1MB
Ignore_Older 5m
[INPUT]
Name tail
Path /dbfs/cluster-logs/*/executor/*/*/stderr*
Tag executor-stderr
Buffer_Max_Size 1MB
Ignore_Older 5m
[FILTER]
Name record_modifier
Match driver-stdout
Record AppName driver-stdout
[FILTER]
Name record_modifier
Match eventlog
Record AppName eventlog
[FILTER]
Name record_modifier
Match driver-stderr
Record AppName driver-stderr
[FILTER]
Name record_modifier
Match driver-log4j
Record AppName driver-log4j
[FILTER]
Name record_modifier
Match executor-stdout
Record AppName executor-stdout
[FILTER]
Name record_modifier
Match executor-stderr
Record AppName executor-stderr
[FILTER]
Name record_modifier
Match *
Record cluster_id Linux
Record linuxhost ${HOSTNAME}
Record namespace Databrick-worker
[FILTER]
Name modify
Match *
Rename ident AppName
Rename procid proc_id
Rename pid proc_id
[FILTER]
Name parser
Match *
Key_Name data
Parser syslog-rfc3164
Reserve_Data On
Preserve_Key On
[OUTPUT]
name stdout
match *
[OUTPUT]
name http
match *
host <ascent endpoint>
port 443
URI /v1/json_batch
Format json
tls on
tls.verify off
net.keepalive off
compress gzip
Header Authorization Bearer <TOKEN>

Azure Event Hubs is a big data streaming platform and event ingestion service capable of receiving and processing millions of events per second. You can integrate your Apica Ascent instance with your event hubs to transform, analyze, and store the data they receive.
Setting up data ingestion from your Azure Event Hubs into Apica Ascent involves the following steps.
Creating an Azure storage account
Creating an Event Hubs namespace and event hub
Configuring Logstash to forward logs to your Apica Ascent instance
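If you prefer the Azure CLI to the portal, the following sketch creates roughly equivalent resources. The resource group, location, and resource names are placeholders, exact flags can vary across CLI versions, and the Capture settings are still configured on the event hub as described in the portal steps below.
# placeholders -- adjust to your environment
RG=<resource-group>
LOCATION=<region>
STORAGE=<storageaccountname>
EH_NAMESPACE=<eventhubs-namespace>
EH_NAME=<event-hub-name>
# storage account used for Event Hubs Capture
az storage account create --name "$STORAGE" --resource-group "$RG" --location "$LOCATION" --sku Standard_LRS
# Event Hubs namespace and event hub
az eventhubs namespace create --name "$EH_NAMESPACE" --resource-group "$RG" --location "$LOCATION"
az eventhubs eventhub create --name "$EH_NAME" --namespace-name "$EH_NAMESPACE" --resource-group "$RG" --partition-count 1
# connection strings you will note down for Logstash
az storage account show-connection-string --name "$STORAGE" --resource-group "$RG"
# namespace-level policy; the portal steps below use an event-hub-level shared access policy instead
az eventhubs namespace authorization-rule keys list --resource-group "$RG" --namespace-name "$EH_NAMESPACE" --name RootManageSharedAccessKey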
To create an Azure storage account, do the following.
Log into your Azure portal and select Storage accounts.
On the Storage accounts page, click Create.
Under Project details on the Basics tab, select the Subscription and Resource group for this new storage account.
Under Instance details, set a unique Storage account name and an appropriate Region.
Click Review + create.
Once the storage account is created, navigate to the Access Keys section.
Click Show keys.
Note down the Key and Connection string under key1.
An Event Hubs namespace provides a unique scoping container within which you can create one or more event hubs. To create an Event Hubs namespace and an event hub within it, do the following.
On your Azure portal, click Create a resource > All services > Event Hubs > Add.
Under Project Details on the Basics tab of the Create Namespace page, select the Subscription and Resource group for this new Event Hubs namespace.
Under Instance details, provide a Namespace name, select a Location, and choose a Pricing tier.
Click Review + create.
Review the configuration, click Create, and wait for the namespace to be created.
After the namespace is created, click Go to resource.
Select Event Hubs in the left menu on the namespace page and then click + Event Hub.
Provide a Name for the event hub.
Set Partition Count and Message Retention to 1.
Set Capture to On.
Set Time window (minutes) to 5.
Set Size window (MB) to 300.
Under Capture Provider, select Azure Storage Account.
Click Select Container and select the storage account you created in the previous step.
Click Create.
After the event hub is created, navigate to Shared Access Policies.
Select your shared access policy and note down the Primary key and Connection string.
As an alternative to the Logstash setup described below, the Apica Ascent App extension for Azure Eventhub provides an easy way to pull data from your event hub. Have the following information ready to configure the Azure Eventhub App extension:
Connection string
Eventhub Instance name
The final step is configuring Logstash to forward event logs from your Azure event hub to Apica Ascent. Download and store the flattenJSON.rb file; we will use this file while configuring Logstash.
Copy the following Logstash configuration and edit the fields listed in the table that follows.
<Path_to_flattenJSON.rb>: the local file path where you saved the flattenJSON.rb file you downloaded
<Event hub connection string>: the Connection string you noted down from your event hub's shared access policy
<Storage connection string>: the Connection string you noted down from your storage account's access keys
<ascent_endpoint>: your Apica Ascent instance endpoint
<ascent_ingest_token>: your Apica Ascent ingest token
input {
azure_event_hubs {
event_hub_connections => ["<Event hub connection string>"]
threads => 5
decorate_events => true
storage_connection => "<Storage connection string>"
initial_position => "look_back"
initial_position_look_back => 72000
}
}
output { stdout { codec => rubydebug } }
filter {
json {
source => "message"
remove_field => "message"
}
split { field => "records" }
date {
match => ["[records][time]", "ISO8601"]
target => "@timestamp"
}
ruby {
path => "<Path_to_flattenJSON.rb>"
script_params => { "field" => "records" }
}
mutate {
split => { "records.RoleLocation" => " " }
join => { "records.RoleLocation" => "-" }
add_field => { "namespace" => "events" }
add_field => { "proc_id" => "%{[records.category]}" }
add_field => { "cluster_id" => "azure" }
add_field => { "app_name" => "activity-logs" }
add_field => { "message" => "%{[records.operationName]}" }
remove_field => [ "records.message", "records.time" ]
}
}
output {
http {
url => "http://<ascent_endpoint>/v1/json_batch"
headers => { "Authorization" => "Bearer <ascent_ingest_token>" }
http_method => "post"
format => "json_batch"
content_type => "json_batch"
pool_max => 2000
pool_max_per_route => 100
socket_timeout => 300
}
}
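With the pipeline saved, Logstash can then be run against it. A minimal sketch, assuming Logstash is installed under /usr/share/logstash and the configuration above is saved as /etc/logstash/conf.d/azure-eventhub-ascent.conf (both paths are placeholders):
# install the Azure Event Hubs input plugin if it is not already bundled with your Logstash distribution
/usr/share/logstash/bin/logstash-plugin install logstash-input-azure_event_hubs
# run the pipeline
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/azure-eventhub-ascent.conf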