17149 Commits (be4034fd1c20f34eccacc34f206c5cd2b039e885)

Author SHA1 Message Date
Justin Clark-Casey (justincc) cbb47f8489 Merge branch 'cpu-performance' of ssh://opensimulator.org/var/git/opensim into cpu-performance 2013-07-18 22:43:15 +01:00
Justin Clark-Casey (justincc) b2b29b7ec0 Fix up a temporary debugging change from last commit which stopped "lludp stop out" from actually doing anything 2013-07-18 22:42:25 +01:00
Diva Canto 27377194cd Changed the timeout of EQ 502s (no events) to 50 secs. The viewer post requests time out in 60 secs.
There's plenty of room for improvement in handling the EQs. Some other time...
2013-07-18 13:48:56 -07:00
Justin Clark-Casey (justincc) 8c6761c152 Do some simple queue empty checks in the main outgoing udp loop instead of always performing these on a separate fired thread.
This appears to improve cpu usage since launching a new thread is more expensive than performing a small amount of inline logic.
However, needs testing at scale.
2013-07-18 21:28:36 +01:00
Diva Canto 553d9cc5d2 Applying the same fix here that dan lake applied to master -- unfortunately I can't cherry-pick because that commit has 2 parents... 2013-07-18 07:52:14 -07:00
Diva Canto c685cc1799 Revert "This is a completely unreasonable thing to do, effectively defying the purpose of BlockingQueues. Trying this, to see the effect on CPU."
This reverts commit 5232ab0496.
2013-07-17 20:42:38 -07:00
Justin Clark-Casey (justincc) 1ba5a05cf7 try Hacking in an AutoResetEvent to control the outgoing UDP loop instead of a continuous loop with sleeps.
Does appear to have a cpu impact but may need further tweaking
2013-07-18 01:17:46 +01:00
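As an illustration of the idea in the commit above (block on a signal instead of polling with sleeps), here is a minimal sketch. It is written in Python rather than C#, uses `threading.Event` as a loose stand-in for `AutoResetEvent` (the event is cleared by hand after each wake-up), and every name in it (`OutgoingLoop`, `enqueue`, `sent`) is hypothetical. It is not the OpenSimulator implementation.

```python
import threading
from collections import deque


class OutgoingLoop:
    """Hypothetical outgoing loop that sleeps until signalled; not OpenSim code."""

    def __init__(self):
        self._queue = deque()
        self._wake = threading.Event()  # stand-in for an AutoResetEvent
        self._stopping = False
        self.sent = []                  # records what the loop "sent"
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def enqueue(self, packet):
        # Producers add work first, then signal the loop.
        self._queue.append(packet)
        self._wake.set()

    def stop(self):
        self._stopping = True
        self._wake.set()
        self._thread.join()

    def _run(self):
        while True:
            self._wake.wait()    # blocks without burning CPU
            self._wake.clear()   # manual reset, emulating auto-reset behaviour
            # Read the stop flag before draining so that nothing enqueued
            # before stop() was called can be left behind.
            stopping = self._stopping
            while self._queue:
                self.sent.append(self._queue.popleft())
            if stopping:
                return


loop = OutgoingLoop()
for packet in ("a", "b", "c"):
    loop.enqueue(packet)
loop.stop()
print(loop.sent)
```

Clearing the event before draining the queue means a signal that arrives during the drain is not lost: the loop simply wakes once more and finds the queue empty.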
Justin Clark-Casey (justincc) 0af3b5ed9a Revert "Put in temporary hack for performance 'queue-empty' logic on a persistent thread rather than through fire and forget"
This reverts commit b402220dbb.

Eliminating fire and forget here does not appear to make a significant difference.
2013-07-18 00:51:10 +01:00
Justin Clark-Casey (justincc) a94a43d249 Revert "Properly remove the hack queue update thread when the viewer shuts down"
This reverts commit 7c544c0d4e.
2013-07-18 00:50:16 +01:00
Justin Clark-Casey (justincc) 7c544c0d4e Properly remove the hack queue update thread when the viewer shuts down
No functional change.
2013-07-18 00:39:28 +01:00
Justin Clark-Casey (justincc) b402220dbb Put in temporary hack for performance 'queue-empty' logic on a persistent thread rather than through fire and forget
May not scale since this gives each client its own thread.
2013-07-18 00:30:22 +01:00
Diva Canto 5232ab0496 This is a completely unreasonable thing to do, effectively defying the purpose of BlockingQueues. Trying this, to see the effect on CPU. 2013-07-17 14:36:55 -07:00
Diva Canto 5f95f4d78e Now trying DoubleQueue instead of BlockingQueue for the PollServiceRequestManager. 2013-07-17 14:09:04 -07:00
Diva Canto 1d3deda10c I confuse myself. Let's try this variable name instead. 2013-07-17 13:26:15 -07:00
Diva Canto af792bc7f2 Do the same trick that dahlia did for Dequeue(timeout) 2013-07-17 13:23:29 -07:00
Diva Canto f4317dc26d Putting the requests back in the queue while testing for count >0 is not the smartest move... 2013-07-17 12:57:34 -07:00
Diva Canto 0f5b616fb0 Didn't mean to commit this change in BlockingQueue.cs 2013-07-17 12:02:00 -07:00
Diva Canto 2b8de2c404 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-17 11:19:56 -07:00
Diva Canto e46459ef21 Cleared up much confusion in PollServiceRequestManager. Here's the history:
When Melanie added the web fetch inventory throttle to core, she made the long poll requests (EQs) effectively be handled on an active loop. All those requests, if they existed, were being constantly dequeued, checked for events (which most often they didn't have), and requeued again. This was an active loop thread on a 100ms cycle!
This fixes the issue. Now the inventory requests, if they aren't ready to be served, are placed directly back in the queue, but the long poll requests aren't placed there until there are events ready to be sent or timeout has been reached.
This puts the LongPollServiceWatcherThread back to 1sec cycle, as it was before.
2013-07-17 11:19:36 -07:00
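The behaviour described in the commit above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the PollServiceRequestManager code: `LongPollRequest`, `watcher_pass`, and the explicit `now` argument are all hypothetical names. The point it shows is that a long poll request stays parked and is handed to the work queue only once it has events or has reached its timeout, rather than being dequeued and requeued on every cycle.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LongPollRequest:
    """Hypothetical long poll (EQ) request; not an OpenSim type."""
    name: str
    deadline: float                      # time at which the request times out
    events: list = field(default_factory=list)


def watcher_pass(parked, ready_queue, now):
    """One pass of a (hypothetical) watcher: queue only the requests
    that have events to send or have timed out; keep the rest parked."""
    still_parked = []
    for request in parked:
        if request.events or now >= request.deadline:
            ready_queue.append(request)
        else:
            still_parked.append(request)
    return still_parked


ready = deque()
parked = [
    LongPollRequest("idle", deadline=50.0),
    LongPollRequest("has-events", deadline=50.0, events=["chat"]),
    LongPollRequest("expired", deadline=10.0),
]

parked = watcher_pass(parked, ready, now=20.0)
ready_names = [request.name for request in ready]
parked_names = [request.name for request in parked]
print(ready_names, parked_names)
```

With the watcher running on a one-second cycle, an idle request costs one cheap check per second instead of a dequeue and requeue every 100 ms.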
Robert Adams 2c8bf4aaa6 BulletSim: fix small bug where everything looked like it was colliding
before the first simulator step.
2013-07-17 10:19:44 -07:00
Diva Canto 894554faf6 Removed the MapItems thread. Redirected the map items requests to the services throttle thread. Didn't change anything in how that processor is implemented, for better or for worse. 2013-07-16 20:28:48 -07:00
Diva Canto 9432f3c94d Improvements to the ServiceThrottleModule: added a category and an itemid to the interface, so that duplicate requests aren't enqueued more than once. 2013-07-16 19:04:30 -07:00
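A minimal sketch of the de-duplication idea in the commit above, assuming a request is identified by a (category, itemid) pair. It is written in Python for illustration; `ThrottleQueue`, `enqueue`, and `process_one` are hypothetical names and do not reflect the actual IServiceThrottleModule interface.

```python
import threading
from collections import deque


class ThrottleQueue:
    """Hypothetical throttle queue that refuses duplicate pending requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()
        self._keys = set()   # (category, itemid) pairs currently queued

    def enqueue(self, category, itemid, continuation):
        """Queue the continuation unless the same request is already pending."""
        key = (category, itemid)
        with self._lock:
            if key in self._keys:
                return False
            self._keys.add(key)
            self._pending.append((key, continuation))
            return True

    def process_one(self):
        """Run the oldest pending continuation; return False if none is queued."""
        with self._lock:
            if not self._pending:
                return False
            key, continuation = self._pending.popleft()
            self._keys.discard(key)
        continuation()       # run outside the lock
        return True


served = []
queue = ThrottleQueue()
first = queue.enqueue("name", "user-1", lambda: served.append("user-1"))
duplicate = queue.enqueue("name", "user-1", lambda: served.append("user-1 again"))
other = queue.enqueue("map", "user-1", lambda: served.append("map for user-1"))

while queue.process_one():
    pass
print(served)
```

Once a request has been processed its key is released, so the same item can be requested again later.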
Diva Canto 5f27aaa6dd UserManagementModule: in the continuation, call the method that also looks up the cache, because the resource may be here in the meantime 2013-07-16 18:22:42 -07:00
Diva Canto 8bad56cb46 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-16 17:53:49 -07:00
Diva Canto d4720bd721 Added config var to fiddle with the Interval for the service throttle thread 2013-07-16 17:53:05 -07:00
Dan Lake 9f129938c9 Attachments module only registers when enabled. This enables alternative attachments module implementations. All calls to Scene.AttachmentsModule are checking for null. Ideally, if we support disabling attachments then we need a null attachments module to register with the scene. 2013-07-16 17:43:36 -07:00
Diva Canto 9f578cf0c8 Deleted a couple of verbose messages 2013-07-16 17:18:11 -07:00
Diva Canto 0419852598 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-16 17:15:08 -07:00
Diva Canto a006caabbc Added IServiceThrottleModule.cs 2013-07-16 17:06:54 -07:00
Diva Canto 99a600753e Changed the name to ServiceThrottle/ServiceThrottleModule in order to reflect its more generic nature. 2013-07-16 17:06:17 -07:00
Diva Canto 3fbd2c54bc Eliminated the UserManagement/UserManagementModule throttle thread. Made the other one generic, taking any continuation. 2013-07-16 17:04:32 -07:00
Justin Clark-Casey (justincc) cbc3576ee2 minor: Add warning method doc about possibly inconsistent results returned from BlockingQueue.Contains(), Count() and GetQueueArray() 2013-07-16 23:14:53 +01:00
Justin Clark-Casey (justincc) 50b8ab60f2 Revert "Revert "MSDN documentation is unclear about whether exiting a lock() block will trigger a Monitor.Wait() to exit, so avoid some locks that don't actually affect the state of the internal queues in the BlockingQueue class.""
This reverts commit 21a09ad3ad.

After more analysis and discussion, it is apparent that Count(), Contains() and GetQueueArray() cannot be made thread-safe anyway without external locking.
And this change appears to have a positive impact on performance.
I still believe that Monitor.Exit() will not release any thread for Monitor.Wait(), as per http://msdn.microsoft.com/en-gb/library/vstudio/system.threading.monitor.exit%28v=vs.100%29.aspx
so this should in theory make no difference, though mono implementation issues could possibly be coming into play.
2013-07-16 23:00:07 +01:00
Justin Clark-Casey (justincc) 21a09ad3ad Revert "MSDN documentation is unclear about whether exiting a lock() block will trigger a Monitor.Wait() to exit, so avoid some locks that don't actually affect the state of the internal queues in the BlockingQueue class."
This reverts commit 42e2a0d66e

Reverting because unfortunately this introduces race conditions because Contains(), Count() and GetQueueArray() may now end up returning the wrong result if another thread performs a simultaneous update on m_queue.
Code such as PollServiceRequestManager.Stop() relies on the count being correct otherwise a request may be lost.
Also, though some of the internal queue methods do not affect state, they are not thread-safe and could return the wrong result generating the same problem
lock() generates Monitor.Enter() and Monitor.Exit() under the covers. Monitor.Exit() does not cause Monitor.Wait() to exit; only Pulse() and PulseAll() will do this.
Reverted with agreement.
2013-07-16 22:03:49 +01:00
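The point argued in the commit above (releasing a lock does not wake a waiting thread; only an explicit pulse does) can be demonstrated with an analogous primitive. The sketch below uses Python's `threading.Condition`, where `wait()` plays roughly the role of Monitor.Wait() and `notify()` the role of Pulse(). This is an analogy only: it says nothing about .NET or Mono internals, and the helper name `waiter_was_woken` is hypothetical.

```python
import threading


def waiter_was_woken(notify):
    """Return True if a waiting thread was woken before its timeout.

    The main thread always acquires and releases the condition's lock;
    it calls notify() only when `notify` is True.
    """
    cond = threading.Condition()
    ready = threading.Event()
    result = []

    def waiter():
        with cond:
            ready.set()  # set while holding the lock
            # wait() releases the lock and blocks; it returns False on timeout.
            result.append(cond.wait(timeout=0.5))

    thread = threading.Thread(target=waiter)
    thread.start()
    ready.wait()
    # This acquire can only succeed once the waiter is inside wait().
    with cond:
        if notify:
            cond.notify()
    # Leaving the block releases the lock, the analogue of Monitor.Exit().
    thread.join()
    return result[0]


woken_without_notify = waiter_was_woken(notify=False)
woken_with_notify = waiter_was_woken(notify=True)
print(woken_without_notify, woken_with_notify)
```

In this sketch the waiter times out when the lock is merely acquired and released, and wakes promptly only when it is notified.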
Diva Canto e0f0b88dec In the pursuit of using less CPU: now trying to avoid blocking queues altogether. Instead, this uses a timer. Not sure if it's better or worse, but worth the try. 2013-07-16 13:01:39 -07:00
Diva Canto 6da50d34df Actually use DoubleQueue in UserManagement/UserManagementModule 2013-07-16 07:19:13 -07:00
Diva Canto 5a01ffa515 High CPU hunt: try a different blocking queue, DoubleQueue 2013-07-16 07:15:14 -07:00
dahlia 6dd454240f revert last commit which seems to conflict with DoubleQueue internals. The random crash might be in DoubleQueue instead. See http://pastebin.com/XhNBNqsc 2013-07-16 02:03:01 -07:00
dahlia 70aa77f520 add locking to internal queue in WebFetchInvDescModule; lack of which caused a random crash in a load test yesterday 2013-07-16 01:31:09 -07:00
dahlia 42e2a0d66e MSDN documentation is unclear about whether exiting a lock() block will trigger a Monitor.Wait() to exit, so avoid some locks that don't actually affect the state of the internal queues in the BlockingQueue class. 2013-07-16 01:12:56 -07:00
Justin Clark-Casey (justincc) e8e073aa97 Simplify EventQueue cap setup so that it is also stat monitored.
Curiously, the number of requests received is always one greater than that shown as handled - needs investigation
2013-07-16 00:05:45 +01:00
Justin Clark-Casey (justincc) eb14e5a175 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-15 23:28:02 +01:00
Justin Clark-Casey (justincc) 1b7b664c86 Add request received/handling stats for caps which are served by http poll handlers.
This adds explicit cap poll handler supporting to the Caps classes rather than relying on callers to do the complicated coding.
Other refactoring was required to get logic into the right places to support this.
2013-07-15 23:27:46 +01:00
Diva Canto 68fbf7eebb Revert "Puts RequestImage (UDP) back to async -- CPU spike hunt"
This reverts commit b060ce96d9.
2013-07-15 12:34:10 -07:00
Diva Canto 687c1a420a Guard against null ref 2013-07-15 12:33:31 -07:00
Diva Canto b060ce96d9 Puts RequestImage (UDP) back to async -- CPU spike hunt 2013-07-15 12:05:31 -07:00
Diva Canto 864f15ce4d Revert the revert
Revert "Trying to hunt the CPU spikes recently experienced."

This reverts commit ac73e70293.
2013-07-15 11:52:26 -07:00
Diva Canto fbb01bd280 Protect against null requests 2013-07-15 11:37:49 -07:00
Diva Canto ac73e70293 Trying to hunt the CPU spikes recently experienced.
Revert "Comment out old inbound UDP throttling hack. This would cause the UDP"

This reverts commit 38e6da5522.
2013-07-15 11:27:49 -07:00
Diva Canto 60325f81d8 This might address the following observed exception:
17:14:28 - [APPLICATION]:
APPLICATION EXCEPTION DETECTED: System.UnhandledExceptionEventArgs
Exception: System.InvalidOperationException: Operation is not valid due to the current state of the object
  at System.Collections.Generic.Queue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Peek () [0x00011] in /root/install/mono-3.1.0/mono/mcs/class/System/System.Collections.Generic/Queue.cs:158
  at System.Collections.Generic.Queue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Dequeue () [0x00000] in /root/install/mono-3.1.0/mono/mcs/class/System/System.Collections.Generic/Queue.cs:140
  at OpenSim.Framework.DoubleQueue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Dequeue (TimeSpan wait, OpenSim.Region.ClientStack.Linden.aPollRequest& res) [0x0004e] in /home/avacon/opensim_2013-07-14/OpenSim/Framework/Util.cs:2297
2013-07-15 10:29:42 -07:00
Diva Canto af02231a7b Added SQLite version of hg travel data store. UNTESTED. Hope it works! 2013-07-14 16:03:46 -07:00
Diva Canto b0140383da Cleanup old hg sessions (older than 2 days) 2013-07-14 15:47:54 -07:00
Diva Canto e33ac50388 HG UAS: Moved hg-session data from memory to DB storage. This makes it so that traveling info survives Robust resets. It should also eliminate the cause of empty IP addresses in agent circuit data that we saw in CC grid. MySQL only. 2013-07-14 14:31:20 -07:00
Diva Canto 5939529036 Minor typo in log message 2013-07-14 14:29:10 -07:00
Diva Canto c8dcb8474d Let's go easy on authenticating ChildAgentUpdates, otherwise this will be chaotic while ppl are using different versions of opensim. Warning only, but no enforcement. 2013-07-14 10:26:05 -07:00
Diva Canto 98f59ffed5 Fix broken tests -- the test setup was wrong... sigh. 2013-07-14 09:22:55 -07:00
Diva Canto c61ff917ef Authenticate ChildAgentUpdate too. 2013-07-14 09:21:28 -07:00
Diva Canto f3b3e21dea Change the auth token to be the user's sessionid. 2013-07-14 07:28:40 -07:00
Diva Canto fcb0349d56 And this fixes the other failing tests. Justin, the thread pool is not being initialized in the tests! 2013-07-13 23:01:41 -07:00
Diva Canto e4f741f006 This should fix the failing test. 2013-07-13 22:52:51 -07:00
Diva Canto a2ee887c6d Deleted a line too many 2013-07-13 22:32:52 -07:00
Diva Canto b4f1b9acf6 Guard against unauthorized agent deletes. 2013-07-13 21:28:46 -07:00
Diva Canto 931eb892d9 Deleted GET agent all around. Not used. 2013-07-13 17:56:42 -07:00
Diva Canto 4d93870fe5 Gatekeeper: stop bogus agents earlier, here at the Gatekeeper. No need to bother the sim. 2013-07-13 17:52:05 -07:00
Diva Canto 5a1d6727e1 Some more debug to see how many threads are available. 2013-07-13 11:39:17 -07:00
Diva Canto bc405a6a34 That didn't fix the problem.
Revert "Trying to reduce CPU usage on logins and TPs: trying radical elimination of all FireAndForgets throughout CompleteMovement. There were 4."

This reverts commit 6825377380.
2013-07-13 11:30:37 -07:00
Diva Canto 6825377380 Trying to reduce CPU usage on logins and TPs: trying radical elimination of all FireAndForgets throughout CompleteMovement. There were 4. 2013-07-13 11:11:18 -07:00
Diva Canto 3a26e366d2 This commit effectively reverses the previous one, but it's just to log that we found the root of the rez delay: the priority scheme BestAvatarResponsiveness, which is currently the default, was the culprit. Changing it to FrontBack made the region rez be a lot more natural.
BestAvatarResponsiveness introduces the region rez delay in cases where the region is full of avatars with lots of attachments, which is the case in CC load tests. In that case, the inworld prims are sent only after all avatar attachments are sent. Not recommended for regions with heavy avatar traffic!
2013-07-13 10:35:41 -07:00
Diva Canto ff4ad60207 Same issue as previous commit. 2013-07-13 10:05:11 -07:00
Diva Canto ccee2959f7 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-13 09:53:05 -07:00
Diva Canto a412b1d682 Moved SendInitialDataToMe to earlier in CompleteMovement. Moved TriggerOnMakeRootAgent to the end of CompleteMovement.
Justin, if you read this, there's a long story here. Some time ago you placed SendInitialDataToMe at the very beginning of client creation (in LLUDPServer). That is problematic, as we discovered relatively recently: on TPs, as soon as the client starts getting data from child agents, it starts requesting resources back *from the simulator where its root agent is*. We found this to be the problem behind meshes missing on HG TPs (because the viewer was requesting the meshes of the receiving sim from the departing grid). But this affects much more than meshes and HG TPs. It may also explain cloud avatars after a local TP: baked textures are only stored in the simulator, so if a child agent receives a UUID of a baked texture in the destination sim and requests that texture from the departing sim where the root agent is, it will fail to get that texture.
Bottom line: we need to delay sending the new simulator data to the viewer until we are absolutely sure that the viewer knows that its main agent is in a new sim. Hence, moving it to CompleteMovement.
Now I am trying to tune the initial rez delay that we all experience in the CC. I think that when I fixed the issue described above, I may have moved SendInitialDataToMe to much later than it should be, so now I'm moving to earlier in CompleteMovement.
2013-07-13 09:46:58 -07:00
Diva Canto cd64a70c79 Added UploadBakedTexture/UploadBakedTextureServerConnector, so that this can eventually be served by a robust instance. NOT FINISHED YET. 2013-07-13 08:31:03 -07:00
Justin Clark-Casey (justincc) d06c85ea77 Reinsert PhysicsActor variable back into SOP.SubscribeForCollisionEvents() in order to avoid a race condition.
A separate PhysicsActor variable is used in case some other thread removes the PhysicsActor whilst this code is executing.
If this is now impossible please revert - just adding this now whilst I remember.
Also makes method comment into proper method doc.
2013-07-13 00:29:07 +01:00
Justin Clark-Casey (justincc) b4cb644a05 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-13 00:03:23 +01:00
Justin Clark-Casey (justincc) 3d118fb580 In co-op termination, extend EventWaitHandle to give this an indefinite lifetime in order to avoid a later RemotingException if scripts are being loaded into their own domains.
This is necessary because XEngineScriptBase now retains a reference to an EventWaitHandle when co-op termination is active.
Aims to address http://opensimulator.org/mantis/view.php?id=6634
2013-07-13 00:02:54 +01:00
Robert Adams fa02f28dbf Add ToOSDMap() overrides to the Stat subclass CounterStat.
Add a GetStatsAsOSDMap method to StatsManager which allows the filtered
fetching of stats for eventual returning over the internets.
2013-07-12 14:04:14 -07:00
Diva Canto 3d700bb42c Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-12 12:54:29 -07:00
Diva Canto 29f6ae199e Changed UploadBakedTextureModule so that it uses the same pattern as the others, in preparation for experiments to direct baked texture uploads to a robust instance. No functional or configuration changes -- should work exactly as before. 2013-07-12 12:53:58 -07:00
Robert Adams 65239b059f Enhance NullEstateData to remember stored estate values and return
them next time asked. This keeps any estate settings from being reset
when the estate dialog is opened in a region with null estate storage.
2013-07-11 20:55:32 -07:00
Robert Adams 1909ee70f8 Centralize duplicated code in SceneObjectPart for subscribing to
collision events. Improve logic for knowing when to add processing
routine to physics actor.
2013-07-11 16:57:07 -07:00
Diva Canto 83d1680057 Added a few more thingies to the asset client test to poke the threadpool. 2013-07-11 16:43:43 -07:00
Justin Clark-Casey (justincc) ba8f9c9d0a Try naming the materials handlers again, this time registering the POST as RenderMaterials
This was probably the mistake.
The other handlers are named RenderMaterials as well but this actually has no effect apart from on stats, due to a (counterintuitive) disconnect between the registration name and the name of the request handler.
Will be tested very soon and reverted if this still does not work.
2013-07-11 23:51:10 +01:00
Justin Clark-Casey (justincc) 7c2e4786ce minor: remove some regression test logging switches accidentally left uncommented. 2013-07-11 23:19:55 +01:00
Justin Clark-Casey (justincc) e15a15688b minor: Take out unnecessary clumsy sleep at the end of regression Test404Response().
This wasn't actually necessary in the end but was accidentally left in.
2013-07-11 23:11:35 +01:00
Justin Clark-Casey (justincc) f57f49eede Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-11 23:05:10 +01:00
Justin Clark-Casey (justincc) 44e9849ed1 Fix regression where llHTTPRequests which did not get an OK response returned 499 and the exception message in the http_response event rather than the actual response code and body.
This was a regression since commit 831e4c3 (Thu Apr 4 00:36:15 2013)
This commit also adds a regression test for this case, though this currently only works with Mono
This aims to address http://opensimulator.org/mantis/view.php?id=6704
2013-07-11 23:02:30 +01:00
Diva Canto ee51a9f9c9 Added property to make for more flexible testing. 2013-07-11 14:23:37 -07:00
Diva Canto 51d106cff8 Added a test for the asset service 2013-07-11 14:21:57 -07:00
Diva Canto c4f1ec1fd6 Changed the UserProfileModule so that it's less greedy in terms of thread usage. 2013-07-11 10:21:20 -07:00
Diva Canto ea371a6f54 Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-11 09:48:15 -07:00
Diva Canto 604967b31e Switched UUIDNameRequest and RegionHandleRequest to Sync, because now they are also non-blocking handlers. 2013-07-11 09:47:46 -07:00
Diva Canto 3b48b6a792 Switched TransferRequest (UDP packet handler) to sync. The permissions checks may block, so they get a FireAndForget. Everything else is non-blocking. 2013-07-11 09:44:48 -07:00
dahlia 0120e858b7 remove names from Capability handlers (added by justincc in commit 013710168b) as they seem to disable the use of multiple access methods for a single Capability in MaterialsDemoModule 2013-07-10 22:30:41 -07:00
Diva Canto 9173130fde Switched RegionHandshakeReply to Sync, because it's not doing anything blocking. 2013-07-10 20:48:13 -07:00
Diva Canto fe5da43d15 EXPERIMENTAL: make RequestImage (UDP packet handler) sync instead of async. This _shouldn't_ screw things up, given that all this does is to dump the request in a queue. 2013-07-10 19:29:14 -07:00
Diva Canto bdaeb02863 show client stats: Fixed the requests/min. Also changed the spelling of the command, now without the dash. 2013-07-10 17:14:20 -07:00
Diva Canto 864a86983e Merge branch 'master' of ssh://opensimulator.org/var/git/opensim 2013-07-10 16:10:04 -07:00
Diva Canto 1b265b213b Added show client-stats [first last] command to expose what viewers are requesting. 2013-07-10 16:09:45 -07:00
Robert Adams 59d19f038a Remove a null reference exception in SimianPresenceServiceConnector that
occurs when GetGridUserInfo cannot find the requested user info.
2013-07-10 08:55:54 -07:00
Robert Adams 38e6da5522 Comment out old inbound UDP throttling hack. This would cause the UDP
reception thread to sleep for 30ms if the number of available user worker
threads got low. It doesn't look like any of the UDP packet types are
marked async so this check is 1) unnecessary and 2) really crazy since
it stops up the reception thread under heavy load without any indication.
2013-07-09 18:34:24 -07:00