Larkin’s crazy ideas #3, 2015.01.18.a. The LarkinLiveHolodizer

Hi all,

The LarkinLiveHolodizer is a system for transmitting live acting performances to the spectator's favourite VR device, be it a holodeck, VR glasses, stereo retina projection, direct nerve induction or whatever. Now: what is the difference from animation or a tweaked recording? It is live. You can enjoy the live performance of a real actor as if you were sitting right there in the first row, watching his performance, his emotion, even his glitches, without needing to be there and without limitations for the actor. Think like the actor: you perform, and several million people are sitting in the first row. With this technology this is possible. The future is bright!

Now: how does it work? The image illustrates the process. Click on it and look at the letters A, B, C on the image.

Before the performance, the actors are scanned and auto-rigged (as you know it from T-posing in front of your Kinect or Xtion). During that process the distributed computer systems find the skeleton positions inside the actors and scan their bodies using a sufficient number of 3D cameras. There are systems that use 32 ordinary single-lens optical cameras and photogrammetry to create 3D point clouds and meshes. But you could think of using 3D cameras from the beginning, each precalculating its 3D information and then transferring the combined 3D guesses to the calculation farm. As an alternative to 3D cams you could use super-cheap lensless cams in huge numbers (1 square cm per camera, thousands or hundreds of thousands of them combined into a completely surrounding curtain) and do the photogrammetry on the calc farm. The calc farm then creates point clouds, then meshes, and does a retopology calculation to get a suitable quad mesh for the actors' bodies. A challenging point might be the dynamic behaviour of soft stuff like cloth: it is hard to distinguish flesh, and parts moving like flesh, from parts that behave differently, like cloth, or from static stuff like helmets, shields, sticks etc. Then the system does automatic rigging and automatic weight painting, and combines the different physical materials using tricks like deform cages, cloth simulations, rigid body simulations etc.
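To make the "automatic weight painting" step a bit more concrete, here is a minimal sketch. It is a naive stand-in, not what real rigging tools do: it simply weights each vertex by its inverse distance to each bone segment. The bone names, the `falloff` exponent and the function names are all made up for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all 3-tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    ab_len_sq = sum(c * c for c in ab)
    if ab_len_sq == 0.0:
        t = 0.0  # degenerate bone: head and tail coincide
    else:
        # Project p onto the bone and clamp to the segment ends.
        t = sum(ap[i] * ab[i] for i in range(3)) / ab_len_sq
        t = max(0.0, min(1.0, t))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)

def auto_weights(vertex, bones, falloff=4.0, eps=1e-6):
    """Naive automatic skinning weights for one vertex.

    bones maps a (hypothetical) bone name to a (head, tail) pair.
    Closer bones get larger weights; the weights sum to 1.
    """
    raw = {
        name: 1.0 / (point_segment_distance(vertex, head, tail) + eps) ** falloff
        for name, (head, tail) in bones.items()
    }
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

# Two bones of a made-up arm, laid out along the x axis.
bones = {
    "upper_arm": ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    "forearm": ((1.0, 0.0, 0.0), (2.0, 0.0, 0.0)),
}
mid_upper = auto_weights((0.5, 0.1, 0.0), bones)  # skin in the middle of the upper arm
elbow = auto_weights((1.0, 0.1, 0.0), bones)      # skin right at the elbow
```

A vertex in the middle of the upper arm ends up almost entirely bound to `upper_arm`, while a vertex at the elbow is shared equally between both bones. Pure distance weighting like this breaks down exactly where the text expects trouble (cloth, helmets, sticks), which is why the deform cages and simulations are needed on top.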

A human rigging specialist should check the result by watching the auto-rigged actors play back a set of standard movements. The advantage of this process is that the actor can use a different outfit and different makeup each time, so it is somewhat like live. A variant might be to skip the rigged, skeleton-based models and just transfer the point clouds or meshes themselves, although the amount of transferred data would then be much greater.
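To get a feeling for "much greater", here is a back-of-envelope comparison of the raw, uncompressed data rates. All numbers are assumptions picked for illustration (60 bones with 7 floats each for position plus rotation quaternion, a 50,000-vertex mesh, 60 frames per second, 4-byte floats); a real system would compress both streams.

```python
def stream_rate_bytes_per_s(elements, floats_per_element, fps, bytes_per_float=4):
    """Raw data rate for sending `elements` items per frame, uncompressed."""
    return elements * floats_per_element * bytes_per_float * fps

# Assumed figures, not measurements:
skeleton_rate = stream_rate_bytes_per_s(elements=60, floats_per_element=7, fps=60)
mesh_rate = stream_rate_bytes_per_s(elements=50_000, floats_per_element=3, fps=60)

print(f"skeleton: {skeleton_rate / 1e6:.2f} MB/s per actor")
print(f"mesh:     {mesh_rate / 1e6:.2f} MB/s per actor")
print(f"ratio:    about {mesh_rate / skeleton_rate:.0f}x")
```

Under these assumptions the skeleton stream is about 0.1 MB/s per actor and the full mesh stream is 36 MB/s, so a few hundred times more.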

During the performance, the cams simply detect the movement of the actors' bones and transfer that information to the calc farm.
The calc farm creates the 3D models before the performance and just animates them during it. It is important to have a large number of cameras to get good results, especially when the number of actors is large. The calc farm also converts the 3D performance to different output formats for different end-user devices. One option is to pass the information on to a realtime render farm, with each virtual machine calculating some pixels, which returns the result to the system as two equirectangular images that must be perfect for 360-degree projection (there is an eye-rotation problem with this, which we solved and will explain separately; otherwise you get wrong three-dimensional results when turning your head too far). Another output might be a feed to a game engine able to process textured meshes with normal maps, specular maps etc. It is probably wise to do all the cloth simulation in the calc farm (B) so that the game engine doesn't have too much to compute. In the worst case you would deliver baked keyframes of the complete mesh per actor to the game engine in the spectator's client software, which puts solid requirements on the ability to transfer a large data stream to the client. Anyway, you will need to find a tradeoff between all those parameters, possibly switching between them dynamically, as is done with the dynamic adjustment of video quality in today's video streams.
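The dynamic switching could look roughly like the sketch below, borrowed from how adaptive video streaming picks a quality level. The mode names, the bandwidth figures and the 0.8 safety factor are all hypothetical, and treating the most demanding mode as the preferred one is an assumption too.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutputMode:
    name: str
    required_mbit_s: float  # hypothetical bandwidth need per spectator

# Ordered from most to least demanding (assumed figures).
MODES = [
    OutputMode("baked_mesh_keyframes", 300.0),
    OutputMode("stereo_equirect_video", 25.0),
    OutputMode("bone_poses_only", 1.0),
]

def choose_mode(measured_mbit_s, modes=MODES, safety=0.8):
    """Pick the first (most demanding) mode that fits the measured throughput.

    Only a fraction `safety` of the measured bandwidth is budgeted, so that
    short dips do not stall the stream. If nothing fits, fall back to the
    least demanding mode.
    """
    budget = measured_mbit_s * safety
    for mode in modes:
        if mode.required_mbit_s <= budget:
            return mode
    return modes[-1]

for bandwidth in (500.0, 100.0, 10.0, 0.5):
    print(bandwidth, "->", choose_mode(bandwidth).name)
```

The client would re-run `choose_mode` every few seconds with a fresh throughput measurement, just like a video player stepping between resolutions.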

The spectator's device may vary: 3D VR glasses today, autostereoscopic TV sets and domes tomorrow, retina projection or 3D nerve induction the day after, who knows. But the platform for live recording, recalculation and conversion to different output formats will standardize and evolve into common systems, just as 2D image autosmoothing and frame-to-frame interpolation are common in today's TV sets.

That’s it for today.

F.C.A. Larkin

License: public domain
Patents or other I.P.: unknown, but I doubt that this is worth a patent, as it is a straightforward use and combination of today's standard systems plus some technical trial-and-error advancements.