Published by Coleen Wright. Modified over 8 years ago.
1
Ing. Ondřej Ševeček | GOPAS a.s. | MCM: Directory Services | MVP: Enterprise Security | ondrej@sevecek.com | www.sevecek.com |
2
Active Directory Replication Issues and Troubleshooting
3
Central Database
LDAP (Lightweight Directory Access Protocol): database query language, similar to SQL
  TCP/UDP 389, SSL TCP 636
Global Catalog (GC): TCP/UDP 3268, SSL TCP 3269
D/COM Dynamic TCP: Replication
D/COM Dynamic TCP: NSPI
Kerberos: UDP/TCP 88
Windows NT 4.0 SAM
  SMB/CIFS TCP 445 (or NetBIOS): password resets, SAM queries
  SMB/DCOM Dynamic TCP: NTLM pass-through, Kerberos PAC validation
4
Design Considerations
Distributed system
DCs can be disconnected for very long times (several months)
Multimaster replication with some FSMO roles
5
Design Considerations
Example: Caribbean cruises, DC/IS/Exchange on board with tens of workstations and users, some staff hired during the journey. Satellite connectivity is bad or missing. DCs are synced after the ship is berthed at the main office.
Challenge: Must work independently for long time periods. Different independent cruise liners/DCs can accommodate changes to user accounts, email addresses, Exchange settings. Cannot afford to lose any of these changes.
6
Database
Microsoft JET engine (JET Blue)
  common with Microsoft Exchange
  used by DHCP, WINS, COM+, WMI, CA, CS, RDS Broker
%WINDIR%\NTDS\NTDS.DIT
ESENTUTL
Opened by LSASS.EXE
7
Installed services (hosted in LSASS)
Security Accounts Manager: TCP 445, SMB + Named Pipes
Kerberos Key Distribution Center: UDP, TCP 88, Kerberos
Active Directory Domain Services: UDP, TCP 389 LDAP; D/COM Dynamic TCP; NTDS.DIT
8
Installed services LSASS SAM KDC NTDS TCP 445 SMB + Named Pipes UDP, TCP 88 Kerberos UDP, TCP 389,... LDAP NT4.0 NTLM Pass-through PAC validation Windows 2000+ LDAP/ADSI Client NTDS Replication FIM/DRS API Client Connect to domain D/COM Dynamic TCP
9
Uninstallation
DCPROMO
  requires working replication connectivity with other DCs
DCPROMO /forceremoval
  does not access the network at all
  can run in DS Restore Mode
10
NTDSUTIL
Metadata Cleanup
Connections
Connect to server srv2.idtt.local
Quit
Select operation target
List sites
Select site 0
List domains in site
Select domain 0
List servers in site
Select server 0
Quit
Remove selected server
11
Metadata Cleanup
12
Active Directory Replication Issues and Troubleshooting
13
Knowledge Consistency Checker (KCC)
First runs 5 minutes after boot
  Repl topology update delay (secs)
Then runs every 15 minutes
  Repl topology update period (secs)
14
Intrasite Replication Topology DC1 DC2 DC4 DC3
15
Originating Updates and Notifications DC1 DC2 DC4 DC3 15 sec 3 sec
16
Notification and Replication
DC1 to DC2: "I have got some changes" (Kerberos-authenticated DCOM, random TCP port)
DC2 to DC1: "Give me your replica" (Kerberos-authenticated DCOM, random TCP port)
17
Intrasite Replication – 3 Hops max. DC1 DC4 DC3 DC5 DC6 DC7 DC2
18
Intersite Replication (no Bridgeheads) DC1 DC2 DC3 DC5 DC6 DC7 DC4
19
Intersite Replication (no Bridgeheads) DC1 DC2 DC3 DC5 DC6 DC7 DC4 15 sec 3 sec schedule
20
Intersite Replication with a Bridgehead DC1 DC2 DC3 DC5 DC6 DC7 DC4 15 sec 3 sec schedule
21
Intrasite Replication
Uses notifications by default (originating/received)
  300/30 sec on Windows 2000
  15/3 sec on Windows 2003
Also occurs every hour as scheduled
  nTDSSiteSettings
  At this frequency KCC detects unavailable partners
HKLM\System\CCS\Services\NTDS\Parameters
  Replicator notify pause after modify (secs)
  Replicator notify pause between DSAs (secs)
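The two notification delays can be illustrated with a small timeline. This is a minimal sketch, assuming partners are notified strictly one after another; the function name and its parameters are illustrative and only mirror the two registry values named on the slide.

```python
def notification_times(partner_count, pause_after_modify=15, pause_between_dsas=3):
    """Seconds after an originating change at which each partner is notified.

    pause_after_modify:  delay before the first partner is notified
    pause_between_dsas:  additional delay before each further partner
    """
    return [pause_after_modify + i * pause_between_dsas for i in range(partner_count)]


# Windows 2003 values from the slide (15/3 sec), three partners
print(notification_times(3))            # [15, 18, 21]
# Windows 2000 values from the slide (300/30 sec), three partners
print(notification_times(3, 300, 30))   # [300, 330, 360]
```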
22
Intrasite Replication DC1 DC2 notification random TCP download changes random TCP 15 sec download changes random TCP schedule
23
Intersite Replication DC1 DC2 download changes random TCP schedule
24
Intersite Replication
Does not use notifications by default
  siteLink: options = USE_NOTIFY (1)
Compression used
  siteLink: options = DISABLE_COMPRESSION (4)
Bridge all site links
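The siteLink options attribute is a bit mask, so the two values above can be combined. A minimal sketch of decoding it follows; the class and function names are illustrative, and TWOWAY_SYNC (2) is not on the slide but is included so the known bits are complete.

```python
from enum import IntFlag


class SiteLinkOptions(IntFlag):
    USE_NOTIFY = 1           # enable change notifications across the site link
    TWOWAY_SYNC = 2          # not mentioned on the slide
    DISABLE_COMPRESSION = 4  # turn off intersite compression


def describe(options):
    """Return the names of the flags set in a siteLink 'options' value."""
    flags = SiteLinkOptions(options)
    return [flag.name for flag in SiteLinkOptions if flag in flags]


print(describe(0))  # default: no notifications, compression on
print(describe(5))  # notifications on, compression off
```

With options = 5 both USE_NOTIFY and DISABLE_COMPRESSION are set, which makes an intersite link behave much like an intrasite one.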
25
Site Link Design
26
Site Link Design (Better?) London Olomouc Roma Cyprus Paris Berlin
27
Site Link Design (Worse?) Olomouc Roma Cyprus Paris Berlin London
28
Static TCP for Replication
HKLM\System\CurrentControlSet\Services
  NTDS\Parameters
    TCP/IP Port = DWORD (Replication + NSPI)
  Netlogon\Parameters
    DCTcpipPort = DWORD (LSASS pass-through)
  NTFRS\Parameters
    RPC TCP/IP Port Assignment = DWORD
DFSRDIAG StaticRPC /port:xxx /Member:dc1
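The three registry values can be turned into reg add command lines for review before anything is applied. This is only a sketch: the port numbers 50000 to 50002 are made-up examples, and the helper name is illustrative. The code prints the commands and does not touch the registry.

```python
BASE = r"HKLM\System\CurrentControlSet\Services"

# (subkey, value name, port); the port numbers are hypothetical examples
STATIC_PORTS = [
    (r"NTDS\Parameters", "TCP/IP Port", 50000),                    # Replication + NSPI
    (r"Netlogon\Parameters", "DCTcpipPort", 50001),                # LSASS pass-through
    (r"NTFRS\Parameters", "RPC TCP/IP Port Assignment", 50002),    # FRS
]


def reg_add_commands(entries=STATIC_PORTS):
    """Build one 'reg add' command line per static port value."""
    return [
        f'reg add "{BASE}\\{subkey}" /v "{name}" /t REG_DWORD /d {port} /f'
        for subkey, name, port in entries
    ]


for line in reg_add_commands():
    print(line)
```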
29
Urgent Replication (Notification)
Intrasite only
  intersite also if notification is enabled
Does not wait for the notification delay (15/3 sec)
Triggered by
  account lockout
  password and lockout policy change
  RID FSMO owner change
  DC password or trust account password change
30
Immediate Replication (Notification)
Password changes from DCs to PDC
Regardless of site boundaries
PDC downloads only the single user object
  all changed attributes, but only the single object
From DC/PDC further with normal replication
31
Example Replication Traffic
Atomic replication of a single object with a one-byte attribute change
Notification + replication, intersite, compressed
Overall 7536 B
  30 packets
  ~10 round trips
  50 ms round trip means 500 ms transfer time
  consumption at 120 kbps
Useful data ~80 B
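The figures on this slide can be checked with a few lines of arithmetic. A minimal sketch, assuming the transfer time is dominated by the round trips; the function name is illustrative.

```python
def transfer_estimate(total_bytes, round_trips, rtt_ms):
    """Return (transfer time in seconds, average bandwidth in kbps)."""
    transfer_s = round_trips * rtt_ms / 1000
    kbps = total_bytes * 8 / transfer_s / 1000
    return transfer_s, kbps


transfer_s, kbps = transfer_estimate(total_bytes=7536, round_trips=10, rtt_ms=50)
useful_share = 80 / 7536  # ~80 B of useful data out of 7536 B on the wire

print(f"transfer time: {transfer_s} s")       # 0.5 s
print(f"bandwidth:     {kbps:.1f} kbps")      # about 120.6 kbps
print(f"useful data:   {useful_share:.1%}")   # about 1.1 %
```

So a one-byte change costs roughly a hundred times its payload in traffic, which is why the replication schedule and the notification settings matter on slow links.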
32
Bridge All Site Links On
Sites: Olomouc, London, Prague, Paris, Roma, Cyprus (partitions A and B)
site links are transitive
can be disabled on IP transport
33
Bridge All Site Links Off
Sites: Olomouc, London, Prague, Paris, Roma, Cyprus (partitions A and B)
site links are not transitive
Cyprus partition is cut off
34
GC Replication Olomouc London Prague Paris Roma Cyprus A A A A A one-way: from the source NC into the nearest GC two-way: GCs between themselves B GC
35
Roma London GC Replication Olomouc Prague Paris Cyprus A A A A B A B one-way: from the source NC into the nearest GC two-way: GCs between themselves GC
36
Subnetting in AD (Apps)
10.10.x.x / 16: DC1 DC2 DC3 DC4 DC5
10.10.0.248 / 29: Exchange
37
Subnetting in AD (Recovery)
10.10.x.x / 16: DC1 DC2 DC3 DC4 DC5
10.10.0.7 / 32: Recovery Site
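Both subnetting tricks rely on a client being mapped to the most specific subnet that contains its address. A minimal sketch of that longest-prefix match follows; the site names are hypothetical and the subnets are the ones from the two slides.

```python
import ipaddress

# subnet -> site; site names are made up for illustration
SUBNETS = {
    "10.10.0.0/16": "MainSite",
    "10.10.0.248/29": "ExchangeSite",
    "10.10.0.7/32": "RecoverySite",
}


def site_for(ip):
    """Return the site of the most specific subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(cidr), site)
        for cidr, site in SUBNETS.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # the longest prefix is the most specific subnet
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(site_for("10.10.5.5"))    # falls into the /16 only
print(site_for("10.10.0.250"))  # falls into the /29
print(site_for("10.10.0.7"))    # falls into the /32
```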
38
Rebuilding After Failure
39
Inter-site
  IntersiteFailuresAllowed
  MaxFailureTimeForIntersiteLink (secs)
Intra-site (immediate neighbors)
  CriticalLinkFailuresAllowed
  MaxFailureTimeForCriticalLink
Intra-site (optimization for non-critical)
  NonCriticalLinkFailuresAllowed
  MaxFailureTimeForNonCriticalLink
40
Active Directory Replication Issues and Troubleshooting
41
Modification operations
Create new object
Modify attributes
  change/delete value
  change distinguishedName = rename
Rename container
  all subobjects renamed as well
42
Replication Metadata
REPADMIN /ShowObjMeta
  lists all attributes of an object
  shows when each was last changed and on which originating DC
43
Replication conflicts
The later action wins
  if neither is later, the winner is random (USN)
Attribute modified on two DCs “simultaneously”
  only one change wins
Linked multivalue attribute modified
  merged (on 2003+ forest level)
Object/container deleted and object modified
  deleted
Object moved into a deleted container
  ends up in CN=LostAndFound
Two objects with the same sAMAccountName, cn or userPrincipalName created
  object renamed, logins duplicated
44
Linked Multi-values
45
DC1 Replication Kamil10:00 Helen11:00 DC2 DC19:00 11:05
46
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC111:30 Kamil10:00 Helen11:00 11:30
47
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC111:30 Kamil10:00 Helen11:00 Judith12:00 12:05
48
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC112:30 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 12:30
49
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC112:30 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 DC1 DC3 Marie11:00 Me 12:30
50
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC112:30 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 DC1 DC3 DC110:30 DC27:00 Kamil10:00 DC1 Marie11:00 Me 12:30
51
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC112:30 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 DC1 DC3 DC110:30 DC27:00 Kamil10:00 DC1 Marie11:00 Me 13:30
52
DC1 Replication Basics Kamil10:00 Helen11:00 DC2 DC112:30 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 DC1 DC3 DC112:30 DC213:30 Kamil10:00 DC1 Marie11:00 Me 13:30
53
DC1 Replication Basics Kamil10:00 Helen11:00 Kamil10:00 Helen11:00 Judith12:00 Judith12:00 DC1 DC3 DC112:30 DC213:30 Marie11:00 DC2 14:15
54
USN
Each object modification increments the USN for that object and for the whole DC
Each DC remembers the USNs of its replication partners
  repadmin /showutdvec
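The USN bookkeeping can be sketched as a toy model. This is a simplification under stated assumptions: objects are identified by name only, each DC keeps a per-partner watermark plus a per-originating-DC vector, and all class, field, and method names are illustrative. The starting USNs (1001, 5001, 3001) and the object names follow the slides in this part of the deck.

```python
from dataclasses import dataclass, field


@dataclass
class DC:
    name: str
    usn: int                                       # highest local USN
    objects: dict = field(default_factory=dict)    # name -> (local USN, originating DC, originating USN)
    watermark: dict = field(default_factory=dict)  # partner -> highest partner USN already pulled
    utd: dict = field(default_factory=dict)        # originating DC -> highest originating USN applied

    def create(self, obj):
        """Originating update: bump the DC-wide USN and stamp the object with it."""
        self.usn += 1
        self.objects[obj] = (self.usn, self.name, self.usn)

    def pull_from(self, src):
        """Pull changes newer than the remembered watermark; skip what is already known."""
        since = self.watermark.get(src.name, 0)
        applied = []
        for obj, (local_usn, orig_dc, orig_usn) in sorted(
            src.objects.items(), key=lambda kv: kv[1][0]
        ):
            if local_usn <= since:
                continue  # already pulled from this partner
            if orig_dc == self.name or orig_usn <= self.utd.get(orig_dc, 0):
                continue  # originated here or already received via another DC
            self.usn += 1
            self.objects[obj] = (self.usn, orig_dc, orig_usn)
            self.utd[orig_dc] = max(self.utd.get(orig_dc, 0), orig_usn)
            applied.append(obj)
        self.watermark[src.name] = src.usn
        return applied


dc1, dc2, dc3 = DC("DC1", 1001), DC("DC2", 5001), DC("DC3", 3001)
dc1.create("Kamil")             # DC1 USN 1002
dc1.create("John")              # DC1 USN 1003
first = dc2.pull_from(dc1)      # DC2 stores them under its own USNs 5002, 5003
dc2.create("Maria")             # DC2 USN 5004
second = dc3.pull_from(dc2)     # DC3 stores all three as 3002..3004
third = dc1.pull_from(dc3)      # DC1 only needs Maria; Kamil and John originated on DC1
print(first, second, third)
```

The last pull shows why the originating DC and USN travel with each change: without them DC1 would receive its own Kamil and John back from DC3.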
55
USN 2 USN 5001 3 USN 3001 1 USN 1001 25001 33001 11001 33001 11001 25001
56
USN 2 USN 5001 3 USN 3001 1 USN 1003 25001 33001 1 3 11001 25001 Kamil1002 John1003 1001
57
USN 2 USN 5001 3 USN 3001 1 USN 1003 25001 33001 1 3 11001 25001 Kamil1002 John1003 Notify Give me 1002, 3 1001
58
USN 2 USN 5003 3 USN 3001 1 USN 1003 25001 33001 11003 33001 11001 25001 Kamil5002 John5003 Kamil1002 John1003
59
USN 2 USN 5004 3 USN 3001 1 USN 1003 25001 33001 11003 33001 11001 25001 Kamil5002 John5003 Maria5004 Kamil1002 John1003
60
USN 2 USN 5004 3 USN 3004 1 USN 1003 25001 33001 11003 33001 11003 25004 Kamil3002 John3003 Kamil5002 John5003 Maria5004 Maria3004 Kamil1002 John1003
61
2 1 1 1 1 USN 2 USN 5004 3 USN 3004 1 USN 1003 25001 33001 11003 33001 11003 25004 Kamil John Kamil 1002 John1003 Kamil John Maria Kamil John 5002 5003 5004 2 1 1 Kamil John Kamil John Maria 3002 3003 3004
62
2 1 1 1 1 USN 2 USN 5004 3 USN 3004 1 USN 1003 25001 33004 11003 33001 11003 25004 Kamil John Kamil 1002 John1003 Kamil John Maria Kamil John 5002 5003 5004 2 1 1 Kamil John Kamil John Maria 3002 3003 3004 Maria2
63
Active Directory Replication Issues and Troubleshooting
64
The Three Problems
Single DC offline for a long time
  not as long as the tombstone lifetime!
  authentication problem
Tombstone lifetime
  two separate DC zones
  not a “business” consistency problem
USN rollback
  restore from snapshot, image, manual backup
  total inconsistency!
65
DC Offline for Long Time DC1 DC2 DC3 DC2PWD21 DC3PWD31 PWD21 Month 0 OLD PWD- PWD31 OLD PWD- MyPWD11
66
DC Offline for Long Time DC1 DC2 DC3 DC2PWD21 DC3PWD31 PWD22 Month 1 OLD PWD21 PWD32 OLD PWD31 MyPWD11
67
DC Offline for Long Time DC1 DC2 DC3 DC2PWD21 DC3PWD31 PWD23 Month 2 OLD PWD22 PWD33 OLD PWD32 MyPWD11
68
PWD 21 DC Offline for Long Time DC1 DC2 DC3 DC2PWD21 DC3PWD31 PWD23 Month 3 OLD PWD22 PWD33 OLD PWD32 Kerberos KDC TGS Ticket MyPWD11
69
PWD 23 DC Offline for Long Time DC1 DC2 DC3 DC2PWD21 DC3PWD31 PWD23 Month 3 OLD PWD22 PWD33 OLD PWD32 KDC Disabled TGS Ticket Kerberos KDC MyPWD11
70
DC Isolated for Long Time DC1 DC2 DC3 MyPWD13 Month 3 Kerberos KDC DC1PWD11 DC1PWD11 KDC Disabled PWD 13 TGT Ticket
71
DC Isolated for Long Time DC1 DC2 DC3 Month 3 DC1PWD14 DC1PWD14 NETDOM RESETPWD PWD 14 TGT Ticket MyPWD14 KDC Disabled
72
Lingering Objects
When a DC has not replicated within the tombstoneLifetime, it halts replication
Replication can be re-enabled by Allow Replication with Divergent and Corrupt Partner
  HKLM\System\CCS\Services\NTDS\Parameters
  turn on, replicate, turn off
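Whether a DC has crossed this limit is a simple date comparison. A minimal sketch follows; the function name is illustrative, and the 180 days used in the example is only an assumed value, since the real tombstoneLifetime has to be read from the forest in question.

```python
from datetime import datetime, timedelta


def exceeded_tombstone_lifetime(last_replication, now, tombstone_lifetime_days):
    """True if the DC has not replicated for longer than the tombstone lifetime."""
    return now - last_replication > timedelta(days=tombstone_lifetime_days)


# Example values are assumptions for illustration only
last_sync = datetime(2024, 1, 1)
print(exceeded_tombstone_lifetime(last_sync, datetime(2024, 3, 1), 180))  # 60 days offline
print(exceeded_tombstone_lifetime(last_sync, datetime(2024, 8, 1), 180))  # 213 days offline
```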
73
DC4 DC3 DC2 DC1 Objects and Tombstones Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
74
DC4 DC3 DC2 DC1 Objects and Tombstones Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
75
DC4 DC3 DC2 DC1 Objects and Tombstones Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
76
DC4 DC3 DC2 DC1 Objects and Tombstones Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
77
DC4 DC3 DC2 DC1 Garbage Collection 1/day Frank Tania Frank Stan Tania Frank Stan Tania Frank Tania
78
DC4 DC3 DC2 DC1 Garbage Collection 1/day Frank Tania Frank Tania Frank Tania Frank Tania
79
DC4 DC3 DC2 DC1 Lingering Objects Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
80
DC4 DC3 DC2 DC1 Lingering Objects Frank Stan Tania Frank Stan Tania Frank Stan Tania Frank Stan Tania
81
DC4 DC3 DC2 DC1 Lingering Objects Frank Tania Frank Stan Frank Tania Frank Stan Tania
82
DC4 DC3 DC2 DC1 Lingering Objects Frank Tania Frank Stan Frank Tania Frank Stan Tania
83
Possible Problems
Inconsistent distributed database
Proliferation of partial objects
  after modification of some attributes
Allow Replication with Divergent and Corrupt Partner
  blocks replication after tombstone lifetime
Strict Replication Consistency
  detects partial objects if replication is allowed
84
Lingering Objects
85
Strict Replication Consistency
HKLM\System\CCS\Services\NTDS\Parameters
  1: do not replicate
  0: request full copy from source
Enabled by default only on new Windows 2003+ installations
86
Automatic Repair Philosophy?
Business logic says “deleted already”
  should we investigate?
Metadata cleanup?
  we may need some data from the vessel
Remove lingering objects
87
Removing Lingering Objects
REPADMIN /RemoveLingeringObjects target sourceGUID DN /advisory_mode
  target: name of the suspected DC with lingering objects
  sourceGUID: healthy DC’s GUID (without {})
  DN: naming context DN
  /advisory_mode: just logs the found objects (on the ill DC)
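Assembling the command in a script helps avoid two easy mistakes: leaving the braces on the GUID and forgetting /advisory_mode on the first pass. A minimal sketch follows; the helper name is illustrative, the GUID is a made-up placeholder, and the naming context DC=idtt,DC=local is assumed from the server name used earlier in the deck. Nothing is executed here.

```python
def remove_lingering_objects_cmd(target, source_guid, naming_context, advisory=True):
    """Build the repadmin argument list; advisory mode is the default (log only)."""
    args = [
        "repadmin",
        "/removelingeringobjects",
        target,                     # suspected DC that holds lingering objects
        source_guid.strip("{}"),    # healthy DC's GUID, braces removed
        naming_context,             # naming context DN
    ]
    if advisory:
        args.append("/advisory_mode")
    return args


cmd = remove_lingering_objects_cmd(
    "srv2.idtt.local",
    "{11111111-2222-3333-4444-555555555555}",  # hypothetical GUID
    "DC=idtt,DC=local",                        # assumed naming context
)
print(" ".join(cmd))
```

Run the advisory form first, review the logged objects on the target DC, and only then build the command again with advisory=False.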
88
Lingering Object found/deleted
89
Correct Registry Settings
Long-term normal operation
  Strict consistency = 1
  Allow divergent partner = 0
Temporary repair operation
  Strict consistency = 1
  Allow divergent partner = 1
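How the two settings interact can be written down as a small decision function. This is a sketch of the behaviour as described on these slides, not of the actual replication engine; the function name, parameters, and result strings are all illustrative.

```python
def replication_outcome(partner_past_tombstone_lifetime,
                        update_for_missing_object,
                        strict_consistency,
                        allow_divergent_partner):
    """Describe what the destination DC does with an incoming update."""
    if partner_past_tombstone_lifetime and not allow_divergent_partner:
        return "blocked: partner exceeded tombstone lifetime"
    if update_for_missing_object:
        if strict_consistency:
            return "stopped: lingering object detected"
        return "full copy requested: lingering object comes back"
    return "replicated"


# Long-term normal operation: strict = 1, allow divergent = 0
print(replication_outcome(True, True, strict_consistency=1, allow_divergent_partner=0))
# Temporary repair operation: strict = 1, allow divergent = 1
print(replication_outcome(True, True, strict_consistency=1, allow_divergent_partner=1))
```

Keeping strict consistency at 1 during the repair is what prevents lingering objects from spreading while replication with the stale partner is temporarily allowed.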
90
USN Rollback
May or may not be detected
Cannot be repaired
  not always lingering objects!
DC must be demoted/repromoted
  unplug network
  DCPROMO /forceremoval
  NTDSUTIL Roles
  NTDSUTIL Metadata Cleanup
91
USN Rollback 1001 DC1 2 USN 5001 1 33001 Snapshot 1001
92
USN Rollback Kamil1002 John1003 Judith1004 Helen1005 1001 DC1 Eva1006 2 USN 5001 1 33001 Snapshot 1001
93
USN Rollback Kamil1002 John1003 Judith1004 Helen1005 1001 DC1 Eva1006 2 USN 5001 11006 33001 Snapshot Kamil1002 John1003 Judith1004 Helen1005 Eva1006
94
Restore 1001 DC1 2 USN 5001 11006 33001 Restore Kamil1002 John1003 Judith1004 Helen1005 Eva1006
95
USN Rollback (Detectable) 1001 DC1 2 USN 5001 11006 33001 Restore Kamil1002 John1003 Judith1004 Helen1005 Eva1006
96
USN Rollback (Detectable) 1001 DC1 2 USN 5001 11006 33001 Restore Kamil1002 John1003 Judith1004 Helen1005 Eva1006 Frank1002 Stan1003
97
USN Rollback (Detectable)
101
USN Rollback (Non-detect.) Frank1002 Stan1003 1001 DC1 2 USN 5001 11006 33001 Tania1004 Mark1005 Martin1006 Victor1007 Leo1008 Restore Kamil1002 John1003 Judith1004 Helen1005 Eva1006
102
USN Rollback (Non-detect.) Frank1002 Stan1003 1001 DC1 2 USN 5001 11008 33001 Tania1004 Mark1005 Martin1006 Victor1007 Leo1008 Restore Victor1007 Leo1008 Kamil1002 John1003 Judith1004 Helen1005 Eva1006
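The difference between the detectable and the non-detectable case comes down to one comparison on the partner side. A minimal sketch, assuming the partner remembers 1006 as the highest USN it has seen from DC1 and DC1 was restored to USN 1001; the function and exception names are illustrative, and the object names follow the slides.

```python
class UsnRollbackDetected(Exception):
    pass


def pull_changes(source_changes, source_usn, partner_watermark):
    """Return the objects the partner would receive from the restored DC.

    source_changes:    dict of USN -> object name created after the restore
    source_usn:        current highest USN on the restored DC
    partner_watermark: highest USN the partner already saw before the restore
    """
    if source_usn < partner_watermark:
        # the source claims a lower USN than the partner has already seen
        raise UsnRollbackDetected(f"source USN {source_usn} < watermark {partner_watermark}")
    return [name for usn, name in sorted(source_changes.items()) if usn > partner_watermark]


# Detectable: only two changes since the restore, USN still below 1006
early = {1002: "Frank", 1003: "Stan"}
try:
    pull_changes(early, source_usn=1003, partner_watermark=1006)
    detected = False
except UsnRollbackDetected:
    detected = True

# Non-detectable: the restored DC has advanced past 1006 before the partner asks
late = {1002: "Frank", 1003: "Stan", 1004: "Tania", 1005: "Mark",
        1006: "Martin", 1007: "Victor", 1008: "Leo"}
received = pull_changes(late, source_usn=1008, partner_watermark=1006)
never_replicated = [name for usn, name in sorted(late.items()) if usn <= 1006]
print(detected, received, never_replicated)
```

In the second case replication appears healthy, yet the five objects created under the reused USNs 1002 to 1006 never leave the restored DC.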
103
Restoring VM Snapshots
Restore offline
HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  Database Restored from Backup = DWORD = 1
Restart NTDS service
  changes InvocationID of the database instance