The Ricoh Company, Ltd.: Streamlining Development After Successfully Outsourcing High-Volume Annotation Work
[Interview with Mr. Odamaki, head of development, at RICOH]
Please give us an overview of your company's business and department.
RICOH focuses on digital services that complement its office-related products and systems. I belong to the SmartVision Business Center, which was originally a digital camera department. RICOH has a long history with cameras, but around the time the iPhone came out, digital camera manufacturers had to change their business model, and we embarked on the development of the THETA spherical camera.
THETA X, a 360-degree camera.
In 2013, we released our first camera, the THETA. It was a new initiative that let you capture your entire 360-degree surroundings in a single shot. Later, we added video support with the THETA m15 and higher resolution with the THETA S. This was followed by a breakthrough in the camera's use in VR. Since its early days, the THETA has been used in virtual tours of real estate, and in recent years its use has skyrocketed in the wake of COVID-19. Business use is still growing, with real estate and construction at the forefront.
What can the latest "THETA X" do?
The latest THETA X has a smartphone-like LCD panel for operation and viewing.
The actual THETA X: small and easy to carry.
https://www.theta360.biz/ (THETA's official website)
May I ask about your career, Mr. Odamaki?
I was originally a hardware engineer and involved in the development of copiers, but there was a period from 2014 to 2016 when I was a visiting researcher at Columbia University for about two years. There, I improved my computer vision skills, and I am now working to apply these skills to technical development.
Was that around the time the THETA S broke out?
Yes, there was already a THETA, and everyone in the industry knew of its existence, which was helpful in many ways. The idea of using multiple lenses to create a 360-degree image has existed in academia since around 1997. The technology that combines camera technology and image processing technology in this way is called computational photography, and one of the creators of the idea for the 360-degree camera was Shree K. Nayar of Columbia University, who is a leading expert in this field. I went to that institute for about two years.
So there was a precedent model?
The THETA is capable of capturing 360-degree omnidirectional images and generating video.
In the early 2000s, there were already examples of 360-degree cameras and their application to real estate. Even today, real estate is still the most important area of application, and we are developing things like automatic cropping, staging AI, and super-resolution for real estate. We are also creating technology to automatically generate video from 360-degree images.
So the video is generated automatically?
Yes, it is. This is a service that creates a camera path to make a room look larger and generates a video.
What kind of functionality is AI Staging?
When selling pre-owned real estate, the property is staged with furniture and photographed, but it is difficult for real estate companies to buy furniture and bring it in, so we place the furniture with CG instead. This is virtual staging, and it is a fully automated service. That is where AI is needed. For example, a sofa in a hallway would be strange, and so would a sofa in front of a door. So labeling doors and windows is necessary. We develop our services using AI based on such data.
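The kind of placement rule described here can be sketched in code. This is a minimal illustrative example, not RICOH's actual system: the label names, room-type rules, and clearance margin below are all hypothetical assumptions chosen to mirror the two "strange" cases mentioned (a sofa in a hallway, a sofa in front of a door).

```python
# Hypothetical sketch of plausibility rules for virtual staging.
# Labeled doors and room types (from annotation data) are used to
# reject implausible furniture placements suggested by CG staging.

ROOM_RULES = {
    # Furniture types that make no sense in a given room type.
    "hallway": {"forbidden": {"sofa", "bed", "dining_table"}},
    "living_room": {"forbidden": set()},
}

def blocks_door(furniture_box, door_boxes, margin=0.5):
    """Return True if a furniture box overlaps the clearance zone
    in front of any labeled door. Boxes are (x1, y1, x2, y2) in
    meters on the floor plan; margin is an assumed clearance."""
    fx1, fy1, fx2, fy2 = furniture_box
    for dx1, dy1, dx2, dy2 in door_boxes:
        # Expand the door box by the clearance margin.
        zx1, zy1, zx2, zy2 = dx1 - margin, dy1 - margin, dx2 + margin, dy2 + margin
        # Standard axis-aligned rectangle overlap test.
        if fx1 < zx2 and fx2 > zx1 and fy1 < zy2 and fy2 > zy1:
            return True
    return False

def is_valid_placement(item, room_type, furniture_box, door_boxes):
    """Check one staged item against room rules and door clearance."""
    rules = ROOM_RULES.get(room_type, {"forbidden": set()})
    if item in rules["forbidden"]:
        return False  # e.g. a sofa in a hallway is strange
    if blocks_door(furniture_box, door_boxes):
        return False  # e.g. a sofa in front of a door is strange
    return True
```

For example, `is_valid_placement("sofa", "hallway", (0, 0, 2, 1), [])` is rejected by the room-type rule, while the same sofa in a living room passes unless it overlaps a labeled door's clearance zone. A production system would learn such constraints from annotated data rather than hand-coding them.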
You are currently the head of the office, right? How many people are in your department?
There are fewer than 10 people, so not that many. The actual research is carried out in collaboration with RICOH's overseas laboratories and a venture company in the United States.
Are all members of the team involved in THETA?
Basically, yes. We are using machine learning to develop super-resolution for the THETA and have presented our work at top international conferences, so I think we are developing at a high level.
You have done furniture labeling and segmentation with us. How is that data being used?
We mainly use your annotation data for AI staging.
What made you decide to use harBest?
It was the resources and speed of the annotators, and beyond that, the lack of bias. Researchers have biases; for example, you don't want to label images that differ from your hypothesis or don't match your intention. That takes a lot of time, and the results are usually poor. In these cases, I wanted an annotator who labels without bias, and who does it quickly. Although it costs money, I like that we can have a flat discussion based on the results and that the process can be properly executed.
Thank you very much. What were your thoughts on our services?
We recently requested object detection annotations from a company overseas, and it didn't go well. We were having trouble because the specifications for fireplaces and kitchens differ between Japan and other countries. It was very helpful that APTO was able to handle it quickly. Since these are just annotation tasks, I'm sure anyone could have done them, but speed is the most important thing for us. In terms of accuracy, we are not that particular in this use case; there are only a limited number of times when high accuracy is needed, so I felt that harBest's speed was an advantage.
Are you experimenting with various tools?
We are experimenting with various annotation tools. My style is to use what is already available in the world as much as possible. The initial work is often done by the researchers. Later, when it becomes okay to increase the number of annotations, we often turn to APTO. If the annotations are simple, it is a waste of the researchers' time to do them, so that is when we request harBest's services.
Did you have any dissatisfaction with what was delivered?
I don't have any complaints. While this isn't related to your company, I find it challenging to articulate requirements during the ordering process. The work involves labeling items that haven't been labeled before, so the tasks become increasingly specialized. For instance, labeling common objects like cats or dogs is straightforward and doesn't require much explanation, but more intricate items are harder to verbalize and explain.
Are there any points you would like to see improved in the future as a tool?
I think it would be nice if it were easier to modify tasks and provide an area for deeper explanations to annotators. For example, with overly specialized labeling like I mentioned earlier, if you ask the general public to label a urethane-sprayed wall, most will wonder what that even is. What we really need and want to provide is such niche data, so we would love to be able to give annotators more in-depth information and material in the future.
What are your visions or goals for the future?
Our vision is a 360-degree panoramic camera that captures everything in its surroundings. Such detailed images are hard for people to view exhaustively, so we aim to have machines analyze and interpret them instead, using machine learning for classification, simulation, object detection, and labeling. Ultimately, our goal is to deliver greater value to our customers by having machines take on tasks traditionally done by humans.
Mr. Odamaki, thank you very much for your valuable time today!
APTO will keep updating our services so that we can deliver even better data!