A domain controller is the first server most organizations deploy in IaaS as they move workloads to Azure.
In a Microsoft Entra ID (formerly Azure AD) Pass-through Authentication scenario, the on-premises domain controller is a single point of failure for every Office 365 authentication request. So is the Microsoft Entra Connect (formerly Azure AD Connect) server. If either service is down, users won’t be able to sign in to Microsoft Entra ID or Office 365.
Pass-through Authentication flows from Microsoft Entra ID > Internet > Microsoft Entra Connect Pass-through Authentication agent > AD domain controller, then back along the same path. Any one of those components can be a single point of failure, but each can be set up for resiliency with high availability and/or DR.
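To make the single-point-of-failure argument concrete, here is a minimal sketch of the availability arithmetic for a chain of components that must all be up, compared with the same chain when each component has a second copy. The availability figures are hypothetical placeholders, not measured or published values, and the model assumes components fail independently, which real outages often violate.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability of N independent copies when any one copy is enough."""
    return 1 - (1 - availability) ** copies


# Hypothetical per-component availabilities (illustrative only).
chain = {
    "internet_link": 0.995,
    "pass_through_agent": 0.99,
    "domain_controller": 0.99,
}

# One of everything: the chain is only as available as the product.
single = series_availability(chain.values())

# Two of everything: each link is replaced by a redundant pair.
resilient = series_availability(
    redundant_availability(a, 2) for a in chain.values()
)

print(f"single copy of each component: {single:.4f}")
print(f"two copies of each component:  {resilient:.4f}")
```

The product in `series_availability` is why each added dependency lowers overall availability, and why redundancy has to be applied to every link in the chain rather than to the domain controller alone.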
To guard against an outage of the entire data center or its Internet connection, put a Domain Controller in Azure. This way if anything happened on-premises, the Azure and Office 365 environments would still be fully functional (assuming users have Internet access).
Choose carefully between Password Hash Sync, Pass-through Authentication, and AD FS. See an excellent primer on identity and Microsoft Entra ID (formerly Azure AD) architecture from my colleague Andy Nelson. With Password Sync, authentication relies far less on the domain controller in real time. In that case, Microsoft Entra ID validates the password itself, and the on-premises environment is simply responsible for syncing the passwords and attributes of objects. If the entire on-premises environment goes down (AD, Microsoft Entra Connect, etc.), users can still authenticate to Microsoft Entra ID with their most recently synced password as long as they have Internet access. Microsoft Entra ID simply won’t receive any attribute or password updates until AD and Microsoft Entra Connect are back online.
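The difference between the two cloud authentication options during an on-premises outage can be sketched as a small decision function. This is a simplified model for illustration only: the `Environment` fields and the `can_sign_in` helper are hypothetical names, AD FS is left out, and real sign-in behavior depends on details not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical snapshot of which pieces are currently reachable."""
    user_has_internet: bool
    pass_through_agent_up: bool
    domain_controller_up: bool


def can_sign_in(method: str, env: Environment) -> bool:
    """Return True if a cloud sign-in can complete in this simplified model."""
    if not env.user_has_internet:
        # Neither method works if the user cannot reach the cloud at all.
        return False
    if method == "password_hash_sync":
        # The cloud validates against the last synced password hash.
        return True
    if method == "pass_through":
        # The agent must be up and must reach a domain controller.
        return env.pass_through_agent_up and env.domain_controller_up
    raise ValueError(f"unknown method: {method}")


# The whole on-premises environment is down, users still have Internet.
on_prem_outage = Environment(
    user_has_internet=True,
    pass_through_agent_up=False,
    domain_controller_up=False,
)

for method in ("password_hash_sync", "pass_through"):
    print(method, can_sign_in(method, on_prem_outage))
# password_hash_sync True
# pass_through False
```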
These rough numbers are provided to help you understand some of the fixed and variable costs of running VMs in Azure.
In summary, that’s how you arrive at roughly $300/mo for a low-end A2_v2 series VM with 2 vCPUs and 4 GB RAM, two managed disks, a VNet, and a VNet gateway for site-to-site VPN, all running full time.
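The shape of an estimate like this can be sketched as hourly charges multiplied by running hours, plus flat monthly charges. Every price in the example call is a made-up placeholder rather than an Azure rate, so the output is not meant to reproduce the ~$300 figure; substitute current numbers from the Azure pricing calculator. The function name and parameters are hypothetical.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours / 12 months


def monthly_cost(vm_hourly, gateway_hourly, disk_monthly, hours=HOURS_PER_MONTH):
    """Estimate one month of cost for a VM behind a VPN gateway.

    vm_hourly      -- hourly VM rate (varies with hours the VM runs)
    gateway_hourly -- hourly VPN gateway rate
    disk_monthly   -- list of flat monthly prices, one per managed disk
    hours          -- hours the VM and gateway run in the month
    """
    compute = vm_hourly * hours
    gateway = gateway_hourly * hours
    storage = sum(disk_monthly)  # billed per disk regardless of hours run
    return compute + gateway + storage


# Placeholder prices for illustration only (not real Azure rates).
estimate = monthly_cost(
    vm_hourly=0.10,
    gateway_hourly=0.05,
    disk_monthly=[5.0, 10.0],
)
print(f"estimated monthly cost: ${estimate:.2f}")
```

Passing a smaller `hours` value shows which parts of the bill shrink when the VM is not running full time and which parts, such as the disks, stay fixed.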
Communication between domain controllers on-premises and in Azure IaaS uses Active Directory replication over the VPN mentioned earlier. Replication between sites typically uses Remote Procedure Call (RPC) over IP, configured through IP site links. You can use SMTP as well, but that is much less common. There are other means of communication, but as long as each DC has replicated recently, it can act fully independently of other DCs and sites.
Then, if the DCs on-premises at HQ or at the data center are unavailable, users can still sign in under a Pass-through Authentication scenario, provided a Pass-through Authentication agent is still running somewhere that can reach the DC in Azure.
If you haven’t read Wired’s cover story of how the NotPetya ransomware infected all Domain Controllers at global shipping company Maersk, it’s a great read. It explains how all DCs were locked in rapid succession, and because Maersk relied on replication as their only failover between sites, they were completely out of business until they rehydrated their AD.
With today’s technologies, a recommended way to guard against what happened to Maersk is Azure Site Recovery (ASR), with Azure Backup as a fail-safe. ASR would have given them a minimum 24-hour window (72 with standard disks with VMware) to recover. Azure Backup would have been a backup to ASR if that window wasn’t met. There are other options as well, such as cloning the virtual machine and keeping that copy offline, but the options in Azure can deliver a reliable, fast RTO.
Microsoft’s got a great blog outlining how to get started, and you can contact eGroup | Enabling’s engineers.
Work with our team of Cloud Computing Consultants who have done this so many times they know all of the “minefields” to prevent missteps.