Will AI pass my aviation exam?

I’m both a technology geek and a pilot. Not surprisingly, when I got my hands on the Azure OpenAI service (in preview) and Azure AI Studio, I wanted to try it out on the aviation topic. Aviation is a highly regulated area of human activity, and many legislative documents define the operations. As I’m studying now for the CPL Air Law exam, I naturally decided to do a quick experiment to see if AI would pass the exam.

The idea is to use Azure Open AI services and Azure AI Studio to fine-tune a GenAI model to be able to answer questions by sourcing information from specific documents provided. I specifically wanted to ensure that there is no “hallucination”, i.e., coming up with facts that don’t exist.

OpenAI, ChatGPT and Azure Open AI

If you are confused about the difference between these – no shame. It’s fresh from the labs and changes all the time. My attempt to clarify it:

  • OpenAI is the company’s name, mostly known for its service ChatGPT based on the Large Language Model (LLM) GPT. It’s been recently in the news for the saga with the company CEO Sam Altman.
  • ChatGPT is a service that makes it easy to interact with such a model. It took the masses by storm with its simple chat-based UI and showcased the technology capability so well that it became the hottest technology topic in 2023.
  • Azure OpenAI is an Azure cloud service that hosts the GPT and other Generative AI models.

Setup

As I’m close to some friendly MVPs, I had early access to Azure Open AI service and playground Azure account. I have launched the following essential 3 services and a few other supporting them as below:

  • Azure Cognitive Search with Semantic Ranking
  • Azure OpenAI service (and its version Azure AI services)
  • Azure AI studio, as a part of the above service

In the Azure AI studio, I’ve deployed gpt-4 as the language model and uploaded for the test AIP book, CASR part 91 and the Part 91 Manual of Standards as the knowledge base. As a part of this process, the Azure Cognitive Search created an index used by the Azure OpenAI service to source the factual data.

Experiment

I ran a few queries on the aviation rules, and here are some results with the comments:

The answer to a basic question about VMC (visual meteorological conditions) is quite good – accurate and relevant to the Australian regulations. When asked the same, ChatGPT gives a more “conversational” answer and refers to FAA rules, statute miles, etc.

Another fact retrieval test – VFR cruising altitudes, looks good!

Well, retrieving facts stated is one skill, but I tried a very specific question that is not clearly defined in one place in the regulations. The below answer has been generated from different bits and is almost accurate as far as I know. Not bad!

I also tried to ask a question that is not defined in the regulations and left happy with the answer – there is no such thing:

Where are the limits?

The above quick tests look quite positive and I run a few more than listed. With some of these, I noticed that the more complex topics get simplified or summarised and the important bits of information are missing or not included in the original answer (while they can be added with clarification and detailing requests).

For example, I asked the question regarding the requirements for the carriage of emergency location transmitters (ELTs). The initial answer was good but not full.

As I realised that the constraint may be the response size limit (set to 800 tokens initially), I tried to increase it (to 2,000 tokens) and the highlighted bit of important information appeared!

As I knew there is something was missing, I asked to clarify the response with regard specifically to life rafts and ELTs requirements and get the updated answer:

Conclusions

Generally, the technology is amazing. Answering whether this AI would pass the Commercial Pilot Air Law exam (passing score of 80%), I can say yes. The accuracy is great with the right setup, the factual retrieval is excellent, and the summarisation of facts is fantastic.

This is definitely a technology to help customers find the right answers to their questions – I see huge application areas for small businesses and large enterprises. The cost of such a service is an important factor here, but it can stack up compared to the costs of a call centre (which I don’t like to talk to anyway).

At the same time, such a solution is limited when the completeness of information is essential. At this stage, it cannot be used for actual flight planning and operational decision-making in aviation or other areas requiring completeness, such as medical decisions and law.

As most of the services listed are in public preview and cannot be used for production solutions, I’m staying tuned to the updates on this technology and looking forward to how it may shape our future.

Original Post https://cloudminded.blog/2023/12/03/will-ai-pass-my-aviation-exam/

0 Votes: 0 Upvotes, 0 Downvotes (0 Points)

Leave a reply

Follow
Sign In/Sign Up Sidebar Search
Popular Now
Loading

Signing-in 3 seconds...

Signing-up 3 seconds...