[Casper] Timing logins and application startup
Thomas Larkin
tlarki at kckps.org
Thu Feb 12 08:11:52 PST 2009
Are there any errors with the logins? For example, if you ssh into a client and watch its system log while a user tries to log in, does it produce any errors? If all your servers are 10.5.5 you should be in good shape. I noticed vast improvements when we ditched 10.5.3 and 10.5.4 on our servers. 10.5.4 was a pile of dung if you ask me. Also, make sure you are using the correct version of the server tools, as mismatched versions can also cause issues.
I would suggest you watch a client log in and see what happens by SSHing into it and watching the system log while a user tries to log in.
Also, have there been any changes to your servers? I assume that at one point this was all working great.
When we had our similar problems I got an Apple engineer involved, and they pretty much told me that OD Masters and Replicas are kind of built around the idea of having no more than 1,000 simultaneous connections.
Also, if you do folder syncing you may want to look at your AFP data throughput charts in Server Admin and see if they drop off sharply for any reason, and check your servers' CPU usage history as well.
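A rough sketch of that approach follows. The host name, the admin account, and the keyword list are placeholders and assumptions, not anything taken from this thread; /var/log/system.log is the usual system log location on Leopard clients, and reading it may require an admin account.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that look like failures.
# The keyword list is an assumption; tune it to what your clients log.
filter_login_errors() {
    grep -i -E 'error|fail|timed out|denied|unable'
}

# Usage (admin@client.example.com is a placeholder):
#   ssh admin@client.example.com 'tail -f /var/log/system.log' | filter_login_errors
# Start this first, then have a user log in at the client and watch what appears.
```

Dropping the filter and reading the raw tail output is just as valid; the filter only cuts down the noise on a busy client.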
___________________________
Thomas Larkin
TIS Department
KCKPS USD500
tlarki at kckps.org
blackberry: 913-449-7589
office: 913-627-0351
>>> Clinton Blackmore <clinton.blackmore at westwind.ab.ca> 02/12/09 9:56 AM >>>
On 12-Feb-09, at 8:13 AM, Thomas Larkin wrote:
Are you by chance running 10.5.3 or 10.5.4? There were known bugs that caused all sorts of sync and login issues, and I saw them myself: it would take literally 2 minutes just to log in with a network account.
Also, how many clients are bound to your Directory Servers?
Most of our clients are running 10.5.4. A handful go back as far as 10.5.2, and some are up-to-date. [This is not counting our older machines that are running Tiger, but they aren't a concern right now.] Most of our 12 directory replicas are running 10.5.5, although the master is running 10.5.6.
For number of clients, I ran:
dscl /LDAPv3/[IP of ODM] list Computers | wc -l
dscl /LDAPv3/[IP of ODM] list Users | wc -l
We currently have 1085 computers in our directory, and 4463 users.
We had a similar login-failure issue three or four months ago, and, after trolling through the logs availed us nothing, we instated a new Open Directory master. [One of my co-workers did it; I think he imaged a server, made it a replica, and then promoted it and made all the other replicas use it as the master.] Things worked great after we did that, until the day that I tried to give a user lesser directory administration privileges, at which point slapd on the master went off the rails and the CPU usage was at 100% for hours at a time. I revoked the privileges, but we have been having problems since then. [Further, we don't recall exactly, but our first master may have started acting up when we gave a user sub-diradmin privileges.] I cannot fathom why this would cause the issue, but it is our best suspicion.
Another symptom is that sometimes a machine will show that network users are available, but they cannot authenticate. On such a machine, dscl sees the LDAP server and the Users directory, but listing said directory brings up zero results. Rebooting or rebinding to the directory often fixes this. So far as we can tell, there is no pattern involving which users or machines will have problems. Just yesterday I saw a user take over 5 minutes to log in to a 2008 iMac connected via a 100 Mb/s (or maybe even gigabit) network, while two-thirds of his class logged in without a problem [except for Word crashing for some of them].
While I am on the topic, can anyone recommend tools for merging or correlating log files?
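Short of a dedicated tool, one possible approach is to interleave several syslog-style logs by timestamp with standard utilities. This is only a sketch under stated assumptions: merge_logs is a hypothetical helper name, each line is assumed to start with a "Feb 12 08:11:52" style timestamp, the machines' clocks are assumed to be in sync, and since these timestamps carry no year, logs spanning a new year will sort incorrectly.

```shell
#!/bin/sh
# merge_logs FILE...
# Tag every line with the file it came from, then sort all lines by the
# syslog-style timestamp at the start of the line (month, day, time).
merge_logs() {
    for f in "$@"; do
        # Append the source file name in brackets so the timestamp
        # stays in fields 1-3 for sort.
        awk -v src="$f" '{ print $0, "[" src "]" }' "$f"
    done | LC_ALL=C sort -s -k1,1M -k2,2n -k3,3
}

# Usage (file names are placeholders):
#   merge_logs client-system.log server-system.log > combined.log
```

The stable sort (-s) keeps lines with identical timestamps in the order the files were given, which helps when one-second resolution is not enough to order events.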
Cheers,
Clinton Blackmore