The major departure from flotsam is to send only one message per destination region, as opposed to one message per group member. This reduces messaging considerably in large groups that have clusters of members in certain regions.
This is a test setup failure since code paths when adding a duplicate root scene presence now require the EntityTransferModule to be present.
Test fixed by adding this module to test setup
These were genuine failures caused by ScenePresence.CompleteMovement() waiting for an UpdateAgent from NPC introduction that would never come.
Instead, we do not wait if the agent is an NPC.
- Child and root agents are only closed after 15 sec, maybe
- If the user comes back, they aren't closed, and everything is reused
- On the receiving side, clients and scene presences are reused if they already exist
- Caps are always recreated (this is where I spent most of my time!). It turns out that, because the agents carry the seeds around, the seed gets the same URL, except for the root agent coming back to a far away region, which gets a new seed (because we don't know what was its seed in the departing region, and we can't send it back to the client when the agent returns there).
This is because a returning viewer by teleport before 15 seconds are up will be disrupted by the close.
The 2 second delay is within the scope where a normal viewer would not allow a teleport back anyway.
Simulator/0.2 (V2) protocol will continue with the longer delay since this is actually the behaviour viewers get from the ll grid
and an early close causes other issues (avatar being sent to infinite locations temporarily, etc.)
This prevents an issue if the user teleports back to the neighbour simulator of a source before 15 seconds have elapsed.
This more closely emulates observed linden behaviour, though the timeout there is 50 secs and applies to all the pre-teleport agents.
Currently sticks a DoNotClose flag on ScenePresence though this may be temporary as possibly it could be incorporated into the ETM state machine
In this new protocol, and as committed before, the viewer is not sent EnableSimulator/EstablishChildCommunication for the destination. Instead, it is sent TeleportFinish directly. TeleportFinish, in turn, makes the viewer send a UserCircuitCode packet followed by CompleteMovementIntoRegion packet. These 2 packets tend to occur one after the other almost immediately to the point that when CMIR arrives the client is not even connected yet and that packet is ignored (there might have been some race conditions here before); then the viewer sends CMIR again within 5-8 secs. But the delay between them may be higher in busier regions, which may lead to race conditions.
This commit improves the process so there are are no race conditions at the destination. CompleteMovement (triggered by the viewer) waits until Update has been sent from the origin. Update, in turn, waits until there is a *root* scene presence -- so making sure CompleteMovement has run MakeRoot. In other words, there are two threadlets at the destination, one from the viewer and one from the origin region, waiting for each other to do the right thing. That makes it safe to close the agent at the origin upon return of the Update call without having to wait for callback, because we are absolutely sure that the viewer knows it is in th new region.
Note also that in the V1 protocol, the destination was getting UseCircuitCode from the viewer twice -- once on EstablishAgentCommunication and then again on TeleportFinish. The second UCC was being ignored, but it shows how we were not following the expected steps...
If this consistently increases then this is a problem since it means the simulator is receiving more requests than it can distribute to other parts of the code.
If we're not receiving packets with multiple threads (m_asyncPacketHandling) then this is critical since it will limit the number of incoming UDP requests that the region can handle and affects packet loss.
If m_asyncPacketHandling then this is less critical though a long process will increase the scope for threads to race.
This is an experimental stat which may be changed.
AgentUpdate packet. This fixes the problem with vehicles not moving forward
after the first up-arrow.
Code to fix a potential exception when using different IClientAPIs.
into the linkset implementation classes.
Add HasSomeCollision attribute that remembers of any component of
a linkset has a collision.
Update vehicle code (BSDynamic) to use the HasSomeCollision in place of
IsColliding to make constraint based linksets properly notice the ground.
Add linkset functions to change physical attributes of all the members
of a linkset.
- The existing event to scene has been split into 2: OnAgentUpdate and OnAgentCameraUpdate, to better reflect the two types of updates that the viewer sends. We can run one without the other, which is what happens when the avie is still but the user is camming around
- Added thresholds (as opposed to equality) to determine whether the update is significant or not. I thin these thresholds are ok, but we can play with them later
- Ignore updates of HeadRotation, which were problematic and aren't being used up stream
Added here since it was the most convenient place
Number is in the last column, "Sig. AgentUpdates" along with percentage of all AgentUpdates
Percentage largely falls over time, most cpu for processing AgentUpdates may be in UDP processing as turning this off even earlier (with "debug lludp toggle agentupdate" results in a big cpu fall
Also tidies up display.
Enabling this will stop anybody from moving on a sim, though all other updates should be unaffected.
Appears to make some cpu difference on very basic testing with a static standing avatar (though not all that much).
Need to see the results with much higher av numbers.
This appears to improve cpu usage since launching a new thread is more expensive than performing a small amount of inline logic.
However, needs testing at scale.
move around when standing on a stationary object.
Create proper linkage between BSCharacter and its actor by generating
a UpdatedProperties event the same way BSPrim does.
successfully tested, and I'm merging back those changes, which proved to
be good.
Revert "Revert "Cleared up much confusion in PollServiceRequestManager. Here's the history:""
This reverts commit fa2370b32e.
When Melanie added the web fetch inventory throttle to core, she made the long poll requests (EQs) effectively be handled on an active loop. All those requests, if they existed, were being constantly dequeued, checked for events (which most often they didn't have), and requeued again. This was an active loop thread on a 100ms cycle!
This fixes the issue. Now the inventory requests, if they aren't ready to be served, are placed directly back in the queue, but the long poll requests aren't placed there until there are events ready to be sent or timeout has been reached.
This puts the LongPollServiceWatcherThread back to 1sec cycle, as it was before.
This reverts commit 21a09ad3ad.
After more analysis and discussion, it is apparant that the Count(), Contains() and GetQueueArray() cannot be made thread-safe anyway without external locking
And this change appears to have a positive impact on performance.
I still believe that Monitor.Exit() will not release any thread for Monitor.Wait(), as per http://msdn.microsoft.com/en-gb/library/vstudio/system.threading.monitor.exit%28v=vs.100%29.aspx
so this should in theory make no difference, though mono implementation issues could possibly be coming into play.
This reverts commit 42e2a0d66e
Reverting because unfortunately this introduces race conditions because Contains(), Count() and GetQueueArray() may now end up returning the wrong result if another thread performs a simultaneous update on m_queue.
Code such as PollServiceRequestManager.Stop() relies on the count being correct otherwise a request may be lost.
Also, though some of the internal queue methods do not affect state, they are not thread-safe and could return the wrong result generating the same problem
lock() generates Monitor.Enter() and Monitor.Exit() under the covers. Monitor.Exit() does not cause Monitor.Wait() to exist, only Pulse() and PulseAll() will do this
Reverted with agreement.
This adds explicit cap poll handler supporting to the Caps classes rather than relying on callers to do the complicated coding.
Other refactoring was required to get logic into the right places to support this.
17:14:28 - [APPLICATION]:
APPLICATION EXCEPTION DETECTED: System.UnhandledExceptionEventArgs
Exception: System.InvalidOperationException: Operation is not valid due to the current state of the object
at System.Collections.Generic.Queue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Peek () [0x00011] in /root/install/mono-3.1.0/mono/mcs/class/System/System.Collections.Generic/Queue.cs:158
at System.Collections.Generic.Queue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Dequeue () [0x00000] in /root/install/mono-3.1.0/mono/mcs/class/System/System.Collections.Generic/Queue.cs:140
at OpenSim.Framework.DoubleQueue`1[OpenSim.Region.ClientStack.Linden.WebFetchInvDescModule+aPollRequest].Dequeue (TimeSpan wait, OpenSim.Region.ClientStack.Linden.aPollRequest& res) [0x0004e] in /home/avacon/opensim_2013-07-14/OpenSim/Framework/Util.cs:2297
Revert "Trying to reduce CPU usage on logins and TPs: trying radical elimination of all FireAndForgets throughout CompleteMovement. There were 4."
This reverts commit 6825377380.
BestAvatarResponsiveness introduces the region rez delay in cases where the region is full of avatars with lots of attachments, which is the case in CC load tests. In that case, the inworld prims are sent only after all avatar attachments are sent. Not recommended for regions with heavy avatar traffic!
Justin, if you read this, there's a long story here. Some time ago you placed SendInitialDataToMe at the very beginning of client creation (in LLUDPServer). That is problematic, as we discovered relatively recently: on TPs, as soon as the client starts getting data from child agents, it starts requesting resources back *from the simulator where its root agent is*. We found this to be the problem behind meshes missing on HG TPs (because the viewer was requesting the meshes of the receiving sim from the departing grid). But this affects much more than meshes and HG TPs. It may also explain cloud avatars after a local TP: baked textures are only stored in the simulator, so if a child agent receives a UUID of a baked texture in the destination sim and requests that texture from the departing sim where the root agent is, it will fail to get that texture.
Bottom line: we need to delay sending the new simulator data to the viewer until we are absolutely sure that the viewer knows that its main agent is in a new sim. Hence, moving it to CompleteMovement.
Now I am trying to tune the initial rez delay that we all experience in the CC. I think that when I fixed the issue described above, I may have moved SendInitialDataToMe to much later than it should be, so now I'm moving to earlier in CompleteMovement.
A separate PhysicsActor variable is used in case some other thread removes the PhysicsActor whilst this code is executing.
If this is now impossible please revert - just adding this now whilst I remember.
Also makes method comment into proper method doc.
This was probably the mistake.
The other handlers are named RenderMaterials as well but this actully has no affect apart from on stats, due to a (counterintuitive) disconnect between the registration name and the name of the request handler.
Will be tested very soon and reverted if this still does not work.