This occurs on v2 teleport since the source region now waits 15 secs before closing the old child agent, which could still receive chat.
This commit introduces a ScenePresenceState.PreClose which is set before the wait, so that ChatModule can check for ScenePresenceState.Running.
This was theoretically also an issue on v1 teleport but since the pause before close was only 2 secs there, it was not noticed.
This message is seen on V2 if one attempts to quickly re-teleport from a source region where one had previously teleported to a non-neighbour and back within 15 secs.
The solution here is for the user to wait a short while.
This message can also be seen on any teleport protocol if one recieves multiple teleport attempts simultaneously. Probably still useful here to help identify misbehaving scripts.
This approach has problems if a client quits without sending a proper logout but then reconnects before the connection is closed due to inactivity.
In this case, the DoNotCloseAfterTeleport was wrongly set.
The simplest approach is to close child agents on teleport as quickly as possible so that races are very unlikely to occur
Hence, this code now closes child agents as the first action after a sucessful teleport.
The root cause was that v2 was only closing neighbour agents if the root connection also needed a close.
However, fixing this requires the neighbour regions also detect when they should not close due to re-teleports re-establishing the child connection.
This involves restructuring the code to introduce a scene presence state machine that can serialize the different add and remove client calls that are now possible with the late close of the
This commit appears to fix these issues and improve teleport, but still has holes on at least quick reteleporting (and possibly occasionally on ordinary teleports).
Also, has not been completely tested yet in scenarios where regions are running on different simulators
This stops OpenSimulator still trying to teleport the user if they hit cancel on the teleport screen or closed the viewer whilst the protocol was trying to create an agent on the remote region.
Ideally, the code may also attempt to tell the destination simulator that the agent should be removed (accounting for issues where the destination was not responding in the first place, etc.)
- Child and root agents are only closed after 15 sec, maybe
- If the user comes back, they aren't closed, and everything is reused
- On the receiving side, clients and scene presences are reused if they already exist
- Caps are always recreated (this is where I spent most of my time!). It turns out that, because the agents carry the seeds around, the seed gets the same URL, except for the root agent coming back to a far away region, which gets a new seed (because we don't know what was its seed in the departing region, and we can't send it back to the client when the agent returns there).
This is because a returning viewer by teleport before 15 seconds are up will be disrupted by the close.
The 2 second delay is within the scope where a normal viewer would not allow a teleport back anyway.
Simulator/0.2 (V2) protocol will continue with the longer delay since this is actually the behaviour viewers get from the ll grid
and an early close causes other issues (avatar being sent to infinite locations temporarily, etc.)
This prevents an issue if the user teleports back to the neighbour simulator of a source before 15 seconds have elapsed.
This more closely emulates observed linden behaviour, though the timeout there is 50 secs and applies to all the pre-teleport agents.
Currently sticks a DoNotClose flag on ScenePresence though this may be temporary as possibly it could be incorporated into the ETM state machine
In this new protocol, and as committed before, the viewer is not sent EnableSimulator/EstablishChildCommunication for the destination. Instead, it is sent TeleportFinish directly. TeleportFinish, in turn, makes the viewer send a UserCircuitCode packet followed by CompleteMovementIntoRegion packet. These 2 packets tend to occur one after the other almost immediately to the point that when CMIR arrives the client is not even connected yet and that packet is ignored (there might have been some race conditions here before); then the viewer sends CMIR again within 5-8 secs. But the delay between them may be higher in busier regions, which may lead to race conditions.
This commit improves the process so there are are no race conditions at the destination. CompleteMovement (triggered by the viewer) waits until Update has been sent from the origin. Update, in turn, waits until there is a *root* scene presence -- so making sure CompleteMovement has run MakeRoot. In other words, there are two threadlets at the destination, one from the viewer and one from the origin region, waiting for each other to do the right thing. That makes it safe to close the agent at the origin upon return of the Update call without having to wait for callback, because we are absolutely sure that the viewer knows it is in th new region.
Note also that in the V1 protocol, the destination was getting UseCircuitCode from the viewer twice -- once on EstablishAgentCommunication and then again on TeleportFinish. The second UCC was being ignored, but it shows how we were not following the expected steps...
This is so that other callers (such as SceneCommunicationService.SendCloseChildAgentConnections() can perform all closes asynchronously without pointlessly firing another thread for local closes).
No functional change apart from elimination of unnecessary chaining of new threads.
Unfortunately fails on Nebadon's system right now. Needs investigation. May put in a temproary option for experimentation soon.
This reverts commit d87ddf50fc.
On my own system, I can now eliminate the pause entirely and the reteleport happens whilst the teleport screen is still up.
Trying this change to see if this is true for other people.
This looks like an off-by-one bug since the view distance was already only 256 on the west and south sides.
This reduces the number of child agents being logged into regions neighbouring a megaregion.
This was because the calculation as to whether a new agent was needed in the receiving region did not take megaregions into account,
unlike the original calculation when the user first teleported into the region.
This meant that on teleport, entity transfer would create a new CAP but this would be ignored by the viewer and receiving region, meaning that the EQ could no longer be used.
This would prevent subsequent teleport, amongst other things.
Currently, regions up to 512m from a megaregion are considered neighbours.
This aims to reduce any side effects if the process tries to complete after the client has logged back in (e.g. it was delayed due to a slow destination region response).
This introduces a new Aborting entity transfer state which signals that the teleport should be stopped but no compensating actions performed.
If we do this after TeleportFinish, then it's possible for a neighbour destination to request the source to create a child agent whilst its still treated as root.
This closes the original presence which we don't really want to do.
This is probably okay (albeit with warnings on the console) but afaics there's no reason not to move the child agent signal.
Also adds regression test for the case where the viewer couldn't connect with the destination region.
Also refactoring of regression test support code associated with entity transfer in order to make this test possible and the code less obscure.
This option allows the simulator to specify that the cancel button on inter-region teleports should never appear.
This exists because sometimes cancellation will result in a stuck avatar requiring relog.
It may be hard to prevent this due to the protocol design (the LL grid has the same issue)
In small controlled grids where teleport failure is practically impossible it can be better to disable teleport cancellation entirely.
Previously, hitting the cancel button on a teleport would cancel on the client side but the request was ignored on the server side.
Cancel would still work if the teleport failed in the early stages (e.g. because the destination never replied to early CreateAgent and UpdateAgent messages).
But if the teleport still completed after a delay here or later on, the viewer would become confused (usual symptom appears to be avatar being unable to move/reteleport).
This commit makes OpenSimulator obey cancellations which are received before it sends the TeleportFinish event queue message and does proper cleanup.
But cancellations received after this (which can happen even though the cancel button is removed as this messages comes on a different thread) can still result in a frozen avatar.
This looks extremely difficult and impossible to fix.
I can replicate the same problem on the Linden Lab grid by hitting cancel immediately after a teleport starts (a teleport which would otherwise quickly succeed).