The Microsoft Sentinel data lake mirrors data from Microsoft Sentinel workspaces. When you onboard to the Microsoft Sentinel data lake, your existing Microsoft Sentinel data connectors are configured to send data to the analytics tier (your Microsoft Sentinel workspaces) and to mirror that data to the data lake tier for longer-term storage. After onboarding, configure your connectors to retain data in each tier according to your requirements.
This article explains how to set up connectors for the Microsoft Sentinel data lake and configure retention. For more information on onboarding, see Onboarding to Microsoft Sentinel data lake.
Configure retention and data tiering
After onboarding, you can enable new connectors and configure retention for existing connectors. You can either send the data to the analytics tier and mirror it to the data lake tier, or send the data only to the data lake tier. You manage retention and tiering from the connector setup pages, or by using the Table management page in the Defender portal. For more information on table management and retention, see Manage data tiers and retention in Microsoft Defender portal.
When you enable a connector, by default the data is sent to the analytics tier and mirrored in the data lake tier. When you enable the Microsoft Sentinel data lake, mirroring is automatically enabled for all tables from onboarding forward. Preexisting data in the tables isn't mirrored. The retention of the data lake tier is set to the same value as the analytics tier, and mirrored data with the same retention as the analytics tier doesn't incur extra billing charges. You can switch to ingesting data to the data lake tier only. When you do, ingestion to the analytics tier stops, and the existing data in the analytics tier is retained according to its retention settings.
The data retained in Archive is still available and can be restored by using Search and Restore functionality.
To configure retention and tiering for the data connector, see Configure data connector.
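The default and data-lake-only behaviors described in this section can be summarized in a short Python model. This is only an illustrative sketch: `TierMode`, `TableSettings`, and the three functions are hypothetical names invented here, not part of any Microsoft Sentinel API or SDK, and the code doesn't change any real configuration.

```python
from dataclasses import dataclass, replace
from enum import Enum


class TierMode(Enum):
    """Hypothetical labels for the two tiering options described above."""
    ANALYTICS_AND_LAKE = "analytics tier, mirrored to the data lake tier"
    LAKE_ONLY = "data lake tier only"


@dataclass(frozen=True)
class TableSettings:
    """Hypothetical per-table settings; not a real Sentinel object."""
    mode: TierMode
    analytics_retention_days: int
    lake_retention_days: int


def default_settings(analytics_retention_days: int) -> TableSettings:
    """Default after enabling a connector: data goes to the analytics tier,
    is mirrored to the data lake tier, and both tiers share one retention."""
    return TableSettings(
        mode=TierMode.ANALYTICS_AND_LAKE,
        analytics_retention_days=analytics_retention_days,
        lake_retention_days=analytics_retention_days,
    )


def mirrored_without_extra_charge(settings: TableSettings) -> bool:
    """True only for the case the article states: mirrored data whose data
    lake retention equals the analytics retention. False means "not covered
    by that statement", not that a charge necessarily applies."""
    return (
        settings.mode is TierMode.ANALYTICS_AND_LAKE
        and settings.lake_retention_days == settings.analytics_retention_days
    )


def switch_to_lake_only(settings: TableSettings) -> TableSettings:
    """Switch to data-lake-only ingestion. New ingestion to the analytics
    tier stops; the analytics retention value is left unchanged because
    existing analytics data is kept according to its retention settings."""
    return replace(settings, mode=TierMode.LAKE_ONLY)


if __name__ == "__main__":
    settings = default_settings(analytics_retention_days=90)
    print(settings.mode.value, mirrored_without_extra_charge(settings))
    print(switch_to_lake_only(settings).mode.value)
```

The 90-day value is an arbitrary example, not a recommended or default retention.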
Microsoft Sentinel XDR data
By default, Microsoft Defender XDR retains threat hunting data in the analytics tier for 30 days. This data is always available. Some XDR tables can be ingested into the analytics and data lake tiers by increasing the retention time to more than 30 days. You can also ingest XDR data directly into the data lake tier without the analytics tier. For more information, see Manage XDR data in Microsoft Sentinel.
Custom log tables
Microsoft Monitoring Agent (MMA) and Log Analytics agent (CLV1) custom tables aren't mirrored to the data lake.
Tables created by using the Logs Ingestion API or Azure Monitor Agent (AMA), and custom tables based on data collection rules (DCRs), are mirrored. For more information, see Logs Ingestion API in Azure Monitor.
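As a quick reference, the mirroring rule for custom tables can be written as a small lookup. The sketch below is illustrative only: the string labels such as `"mma_custom"` and the function name are hypothetical, chosen here to stand for the table origins named above, and aren't identifiers used by Microsoft Sentinel or Azure Monitor.

```python
# Hypothetical labels for how a custom table was created.
MIRRORED_KINDS = frozenset({"logs_ingestion_api", "ama", "dcr_custom"})
NOT_MIRRORED_KINDS = frozenset({"mma_custom", "clv1_custom"})


def is_mirrored_to_data_lake(kind: str) -> bool:
    """Return whether a custom table of the given origin is mirrored,
    following the two statements in this section."""
    if kind in MIRRORED_KINDS:
        return True
    if kind in NOT_MIRRORED_KINDS:
        return False
    # The article doesn't cover other origins, so refuse to guess.
    raise ValueError(f"unknown custom table kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("ama", "mma_custom"):
        print(kind, is_mirrored_to_data_lake(kind))
```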
Auxiliary log tables
When you onboard to both Microsoft Defender and Microsoft Sentinel and then onboard to the data lake, you no longer see auxiliary log tables in Advanced hunting in Microsoft Defender or in Microsoft Sentinel in the Azure portal. The auxiliary table data is available in the data lake, and you can query it by using KQL queries or Jupyter notebooks. Find KQL queries under Microsoft Sentinel > Data lake exploration in the Defender portal.
Direct ingestion to the data lake tier
Depending on your organization's security needs, you might choose to ingest some log sources directly into the data lake. Directly ingesting logs to the data lake allows you to better manage costs by optimizing data retention and storage based on the value of the data for real-time detection versus long-term analysis.
Ingest high-volume logs that are less critical for real-time detection but valuable for deep analysis and forensics directly to the lake, and ingest only high-value logs to the analytics tier. Note that logs ingested to the analytics tier are also mirrored to the data lake.
Use the following table to prioritize which sources you should ingest directly to the data lake versus the analytics tier.
| Log source type | Typical log volume | Value for real-time threat detection and alerting | Value for threat hunting | Value for incident investigation and forensics | Ingest to data lake |
|---|---|---|---|---|---|
| AAA (TACACS/Radius) | Medium | High | High | High | Yes |
| Active Directory (on-premises) | High | High | High | High | No |
| Application Logs | High | Medium | Medium | High | Yes |
| AV Logs (Windows Events 5000s & 3rd party) | Medium | High | High | High | No |
| Azure Activity | Medium | High | High | High | No |
| Biometric Access System Logs | Low | Medium | Low | High | Yes |
| Building Security System Logs | Low | Low | Low | Medium | Yes |
| Call Center/VoIP Logs | Medium | Low | Low | Medium | Yes |
| CASB | High | High | High | High | Yes |
| Citrix/Horizon/ALBs | Medium | Medium | Medium | High | Yes |
| Cloud IAM | Medium | High | High | High | No |
| Cloud PaaS | High | High | High | High | Yes |
| Cloud Security Controls | Medium | High | Medium | High | No |
| Cloud Storage (S3, Blob, etc.) Logs | High | High | High | High | No |
| CRM Audit Logs | Low-Medium | Low | Low | Medium | Yes |
| Database Audit Tools | Medium | High | High | High | Yes |
| DHCP Logs | Medium | Medium | Medium | High | Yes |
| DLP Alerts | Low | High | High | High | Yes |
| DNS Logs | High | High | High | High | Yes |
| Endpoint Detection and Response (EDR) (Alerts) | Medium | High | High | High | No |
| Endpoint Detection and Response (EDR) (Raw) | High | High | High | High | Yes |
| Email Security (3rd party alerts) | Medium | High | Medium | High | No |
| ERP Audit Logs | Low-Medium | Low | Low | Medium | Yes |
| File Integrity | Low | Medium | Medium | High | Yes |
| Firewall Threat/Malware/IPS/IDS | High | High | High | High | No |
| Firewall Traffic Logs | High | High | High | High | Yes |
| GitHub/GitLab/Code Repo Logs | Low-Medium | Medium | Medium | High | Yes |
| Google Workspace Logs | Medium | Medium | Medium | High | Yes |
| Identity (Entra ID, Okta, LDAP) | Medium | High | High | High | No |
| IIS/Apache Logs | Medium | High | High | High | Yes |
| IoT Device Logs | High | Medium | Medium | Medium | Yes |
| Kubernetes/Container Logs (alerts, critical) | High | High | High | High | No |
| Kubernetes/Container Logs (raw logs) | High | High | High | High | Yes |
| LAN/WAN Router Switch | High | Medium | Medium | Medium | Yes |
| Linux Server AuditD | Medium | High | High | High | No |
| Mobile Device Management (Intune) | Medium | Medium | Medium | Medium | Yes |
| Microsoft Office Logs (Teams, Office, SharePoint) | Medium | Medium | Medium | High | No |
| Microsoft XDR Alerts (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | No |
| Multifactor authentication (MFA) | Medium | High | Medium | High | No |
| Netflow | High | Medium | High | Medium | Yes |
| Network Detection (Corelight, Vectra, Darktrace) | High | High | High | High | No |
| OT/ICS System Logs | Medium | High | High | High | Yes |
| PAM (Privileged Access Management) | Low | High | High | High | No |
| PIM (Privileged Identity Management) | Low | High | High | High | No |
| POS System Logs | High | High | High | High | Yes |
| Proxy Logging (URL filtering) | High | High | High | High | Yes |
| Salesforce Audit Logs | Medium | Medium | Medium | High | Yes |
| SD-WAN | Medium | Medium | Medium | Medium | Yes |
| ServiceNow Audit Logs | Low | Low | Low | Medium | Yes |
| SIEM/SOAR Platform Logs | Medium | High | High | High | No |
| Slack/Teams Collaboration Logs | Medium | Low | Medium | Medium | Yes |
| Sysmon (Endpoint, for EDR complement) | Medium | High | High | High | Yes |
| Threat Intelligence Indicators | Low | High | High | High | No |
| VDI Logs | Medium | Medium | Medium | High | Yes |
| VPN | Medium | High | High | High | No |
| Vulnerability Scanning | Low | Medium | Medium | Medium | Yes |
| Web Application Firewall (WAF) Logs | Medium | High | High | High | Yes |
| Windows Server Events | High | High | High | High | No |
| XDR Source Logs (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | No |
| Zoom Meeting Logs | Low-Medium | Low | Low | Medium | Yes |
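
If you keep a copy of this table alongside your own source inventory, a small script can shortlist candidates for direct data lake ingestion. The following sketch is one hypothetical way to do that in Python. The `LogSource` type and `lake_candidates` function are names invented for this example, and only five rows are copied from the table above to keep it short.

```python
from typing import NamedTuple


class LogSource(NamedTuple):
    """One row of the prioritization table (hypothetical representation)."""
    name: str
    volume: str
    detection_value: str
    hunting_value: str
    forensics_value: str
    ingest_to_lake: bool


# A few rows copied from the table above; extend with the sources you use.
ROWS = [
    LogSource("DNS Logs", "High", "High", "High", "High", True),
    LogSource("Firewall Traffic Logs", "High", "High", "High", "High", True),
    LogSource("Netflow", "High", "Medium", "High", "Medium", True),
    LogSource("Windows Server Events", "High", "High", "High", "High", False),
    LogSource("Threat Intelligence Indicators", "Low", "High", "High", "High", False),
]


def lake_candidates(rows: list[LogSource]) -> list[str]:
    """Names of sources the table marks for direct data lake ingestion,
    sorted alphabetically."""
    return sorted(row.name for row in rows if row.ingest_to_lake)


if __name__ == "__main__":
    for name in lake_candidates(ROWS):
        print(name)
```

Treat the result as a starting point: the table gives general guidance, and the right tier for a source still depends on your organization's detection and retention requirements.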