BudgieTalk: Animal Translator (Beta)

We're learning numbers
  • "Man has much power of discourse which for the most part is vain and false; animals have but little, but it is useful and true, and a small truth is better than a great lie" 
  • "... They will hear every kind of animals speak in human language. They will instantaneously run in person in various parts of the world, without motion. They will see the greatest splendour in the midst of darkness. O! marvel of the human race! What madness has led you thus! You will speak with animals of every species and they with you in human speech. You will see yourself fall from great heights without any harm and torrents will accompany you, and will mingle with their rapid course." - Leonardo da Vinci

This blog post is about how I somewhat inadvertently developed a way to translate my birds' vocalizations into words and how this methodology could be used as the basis for an animal or interspecies translator. I don't have the programming skills to develop it further but I can present what I've been able to do so far. I'm working on a video to demonstrate and describe it better.

*(I'm currently updating this blog so it's not fully edited or done.) *

I'm going to go over this in such detail so those who are interested (and have birds or other vocal animals) can try it but also to show that it's not being faked or fabricated. Obviously most of the dictation isn't accurate but it's just a proof of concept and a starting point to something that will become more accurate over time.

The purpose is to use this process/ program to differentiate the different vocalizations into words we can recognize - the next step is to understand what they mean.

It can also be used as a teaching program. It essentially works as both a language translator AND generator. It will hopefully evolve into the development of a new universal language which other species will be able to learn and then communicate and translate their own vocalizations through.

Background Information and Animal Communication Potential:

These “SignAloud” gloves developed by UW sophomores Navid Azodi and Thomas Pryor translate American Sign Language into speech and text. - Source
This is Amy from Michael Crichton's "Congo". She was equipped with a backpack that allowed her sign language to be converted to human speech. (Fiction)

This technology and Koko could make "Amy" a reality. Fiction becomes fact.

This is Darwin, a dolphin from the series seaQuest DSV. Darwin lived in a tank on their ship and was equipped with a device that would translate his squeaks and whistles to English. (Fiction)
  • Software has performed the first real-time translation of a dolphin whistle – and better data tools are giving fresh insights into primate communication too. - Source

In the 1960s, Margaret Lovatt was part of a NASA-funded project to communicate with dolphins. Soon she was living with 'Peter' 24 hours a day in a converted house. Christopher Riley reports on an experiment that went tragically wrong - Source

Margaret was making a lot of progress in the few weeks she was able to work with him. The only limitations were lack of time and the dolphin's inability to vocalize properly using its blowhole - not any lack of understanding or desire. If the right software were developed there is little doubt a dolphin could communicate. Or if a human could learn to whistle really well.

Info on Animal Communication:

  • Puck, a male budgerigar owned by American Camille Jordan, holds the world record for the largest vocabulary of any bird, at 1,728 words. Puck died in 1994, with the record first appearing in the 1995 edition of Guinness World Records. - Source
  • The Semantics of Vervet Monkey Alarm Calls: Part I - Source

Speech to Text:

STT is when you talk and the computer turns those sounds into words. This is what Siri and Google Now do.

Text to Speech:
TTS is when the computer takes text and then speaks those words in its computer-simulated voice.

Commands are words and phrases that the computer recognizes and then acts on. "Check my email" would open your email program. "What is the weather" would show you the weather on an app.

Dictation is when the computer will attempt to type out the words that you speak.

Voice Macro:
A type of program that allows you to program in certain words and phrases that will then execute a command or do something else. The computer will listen for words and then write them out. If these words are recognized as a command it will then execute that command.

Sound Recognition:
This would be the ability for a computer to recognize a specific sound. Eg: A doorbell or a specific song or tv episode or a car engine. There are apps that you can turn on and it will listen for a song and then take you to it so you can buy it. The same could apply with other sounds but is more difficult due to the inevitable variations between the sounds of the same things.

Sound Macro: 
This would be the same as a Voice Macro but the computer would perform an action when it heard a sound rather than a voice and words. So you could program it in to turn on a light whenever it heard the garage door open or turn on a security camera if it heard a car drive up - stuff like that.
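A sound macro boils down to a lookup from a recognized sound label to an action. The sketch below assumes some recognizer (not shown here) already hands us a label such as `garage_door`; the labels, the actions and the `on_sound` name are all made up for illustration, and the "actions" only write to a log instead of switching on real lights or cameras.

```python
from typing import Callable, Dict, List

# Stand-in for real side effects (lights, cameras); hypothetical.
log: List[str] = []

# Hypothetical sound labels mapped to the action each should trigger.
SOUND_MACROS: Dict[str, Callable[[], None]] = {
    "garage_door": lambda: log.append("turn on light"),
    "car_engine": lambda: log.append("start security camera"),
}


def on_sound(label: str) -> bool:
    """Run the action bound to a recognized sound label, if there is one."""
    action = SOUND_MACROS.get(label)
    if action is None:
        return False  # sound heard, but nothing is programmed for it
    action()
    return True
```

Calling `on_sound("garage_door")` runs the first action, while an unprogrammed label such as `"doorbell"` is simply ignored.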

Voice and Text Translation 
Uses a computer's ability to translate words between different languages, combined with speech to text. So if you say "Hello" and have the computer set to recognize English and then translate it to Spanish it would then repeat back "Hola." This enables two people who speak different languages to talk into their phones and have them translate and then speak out a different language - and another phone hear that language and do the same. Essentially a digital translator. Being able to do this with any language would make it a "universal translator."

Sound Translation: 
This would be the ability for a computer to hear a noise and then recognize what it was. So if it heard a doorbell it could say "Doorbell", and likewise for a dog bark or a cat meow.

Animal Translator: 
Now we get to the fun stuff! What if we could combine everything above into a program that would be able to not only recognize the different sounds that different animals make but differentiate the sounds within a single species - and not only that, but then translate them into English (or any other language)? That is the basis of BudgieTalk - with Budgies/ Parakeets as the starting point to develop the program and methodology which could then be applied to others.


A little over 2 years ago I got my lil English Budgie Danny Cooper with the #1 goal of teaching him to talk. Unfortunately he is a she - and SHE does not like to talk. I probably said "Hey baby" 2.3 million times and she just gives me a funny look in silence. She would only sometimes chirp along to some music and very very very occasionally do what sounded like mumbling in my ear. It sounds like someone speaking English but while covering their mouth and talking reallllly quietly.

You see, girl budgies are less likely to talk and if they do they do so less often, sometimes, like Danny, not at all. That has not been my experience with human women, but that is beside the point. The point being - I had to get another Budgie - a boy this time! Enter - Dylan Pryce. I had him DNA tested and got him as soon as he was weaned - as young as you can. He was friendly, loves me, he's adorable but honestly and unfortunately - he's a little 'special' - probably due to the in-breeding of the "English" variety of budgies. They make awesome company but I was still left with my primary goal of teaching one to talk not being met - at all. Again he would sometimes mumble in my ear and sometimes talk a little in his sleep.

This was not for lack of trying. I would constantly repeat phrases - talk to them as if they could understand me, make them watch educational videos, baby learning courses and even watch other talking budgies on YouTube - over and over. Nope. Not going to happen. They wouldn't even make normal budgie vocalizations. Silent treatment. That's not to say they are dumb (well Dylan is a little) but Danny is obviously smart and seems to understand me to an extent - she just won't talk back.

That is her playing the piano - see - SMART - just not talkative. 

So I was like - ok, I'm going to have to try again but this time I'll get a regular parakeet from the pet store - not a baby that never learned to talk but one that already could at least talk like a bird. Then even if they don't learn to talk they will at least have babies and we can teach them to talk bird talk AND English.

Fast Forward

and I have 5 birds - no babies and these little flockers will NOT SHUT UP.

  1. Danny (girl) (English Budgie)
  2. Dylan (English Budgie)
  3. Luca (girl) (Parakeet)
  4. Matt (Parakeet)
  5. Bradley (Parakeet)

Thankfully they will not make a peep, no matter how long I sleep in, but as soon as I'm up - they are screaming until I go to bed. They even talk in their naps! Thankfully I'm able to either tune them out or put them out in the atrium.

When I say talking I mean parakeet talking, not English. Or at least - that is what I thought...

A few months ago I was working on a project that would allow me to macro advanced Photoshop commands. I realized I could assign actions to certain keys so if I had something that required a lot of steps/ button presses/ etc I could execute them with a press of a button on the keyboard and it would do the action automatically. HUGE timesaver. That was until I ran out of keyboard buttons. Not only that but I was having a very hard time remembering which key did which command.

So I thought to myself - you know what would be really cool? If I could just say a command and it would do it! That's when I found voice macros. They are programs that, like Photoshop actions, let you program other things to happen based off a single command. These commands could be your voice! So instead of having to press Shift-F2 to start a macro that would change the image's size I could just say "change size" and the computer would press Shift-F2 and then yadda yadda change the size or whatever else I had programmed it to do. I could say "Merge Layers" or "Brighten" "Save" etc and I wouldn't have to press a button or memorize anything. This isn't necessarily faster than pressing a button but like I said there were not enough buttons on my keyboards for all the different macros I was making and I didn't have to remember which one did what. I could just say what I wanted and it would happen.

This was the first time I had done any kind of programming and opened up a totally new world. In a couple days I had hundreds of commands that could pretty much operate my entire computer verbally. "Open google" "Tell me a joke" "run a virus scan" "check the weather" "check my email" "play music" - almost anything I could think of I could assign a voice command to it. 

It was nothing less than amazing and I'll probably post about it another time. There are a lot of little technicalities that went into it such as teaching the computer to better understand your particular voice, the quality of the microphone, and working through pronunciations and stuff like that. One thing about voice recognition software is that it has to be tuned to your individual voice and this requires reading some sentences over and over but the more you use it - the better it gets.

Here is where the little problems started to happen. When I first started this I had two rooms - an 'office' and a sleeping room. My birds were in the sleeping room. I had put a hole in between and was working on teaching them to go back and forth but that's another story. I decided to combine my rooms together and this meant that I would be cohabitating 24/7 with my now very talkative flock. Not only do they like to talk, they HAVE to talk whenever I am talking. You can see how trying to program your computer to listen to you gets complicated when you have really loud birds singing in the background.

This was also around the time I purchased a new microphone - a fairly nice one since my webcam mics were just ok. This upgrade made the speech recognition 10x better - it wasn't that I wasn't talking clearly but that the microphones I was using were just not that sensitive. This new mic was so much better that it began to pick up the noises of me typing on my keyboard and tried to turn them into words. The sound my chair would make if I leaned back would be picked up and converted as well. I realized the words it thought the keyboard typing sound made could be turned into commands. So whenever I would type (depending on how long I typed for) it would hear the typing as someone saying "two" or "two two" or "two two two" (I forgot the exact words) but regardless I could then program the Voice Macro to say "you are typing" whenever it would hear "two" - which was really the keyboard clicking as I typed. This turned it into not just a Speech to Text program but a Sound Recognition program since I could program the sounds it translated to words into macro commands. Cool, but not too helpful. It could say "you are typing" or I could program it to do anything else anytime I would type.

So you can imagine what happened when the birds moved in... 

Their bird babble ended up being a lot more varied, according to my computer, than I could ever imagine. Actually it wasn't that I couldn't imagine it - it was what I was expecting and had been trying to teach them - I just didn't expect it to work as well as it did. I had thought they just kept making the same vocalizations over and over but my computer, and my new microphone which could hear better than I could - told me otherwise.

Unlike my loud typing keyboard which would elicit maybe 3 differently dictated words the birds were generating dozens, hundreds, and potentially thousands! You see it's not just sounds being interpreted as a word but sounds being interpreted as phrases and sentences and paragraphs.

I watched with utter amazement as my computer started to type out pages of transcripts that it thought my birds were saying. WTF. I expected it to try to interpret them talking as words just like it did with almost any sound but not that they were generating so many different sounds. Sounds unique enough from each other that they were being recognized as different words - lots of different words - but some of which were the same words and phrases being said over and over.

This repetition meant that they were not just random words being generated from random sounds but that they were making the same sounds consistently and that they were being dictated and differentiated from each other consistently as well. 
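One way to check for this kind of repetition is to count how often each dictated phrase shows up. This is only a sketch, not something VoiceMacro does for you: it assumes the dictations have been copied out as a list of strings, and the `repeated_phrases` name and the sample list are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def repeated_phrases(dictations: Iterable[str], min_count: int = 2) -> List[Tuple[str, int]]:
    """Return phrases dictated at least min_count times, most frequent first."""
    # Lowercase and trim so "Each much" and "each much" count as the same phrase.
    counts = Counter(d.strip().lower() for d in dictations if d.strip())
    return [(phrase, n) for phrase, n in counts.most_common() if n >= min_count]


# Hypothetical transcript lines, in the spirit of the examples in this post.
sample = ["Each much", "each much", "You much each", "each much", "get"]
print(repeated_phrases(sample))
```

Phrases that only ever appear once drop out of the result, which leaves the ones worth programming in.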

One way to think about it is to compare it to humans. If you took the average person and documented / recorded/ dictated everything they said there would be some words and phrases that would make up a very large portion of what they actually spoke.

Good morning. How are you? Goodbye. Hello. Hi. What's up? Thank you. Please. Where are you? Okay. yes. no. 

etc. You could make a list of the most common words and expressions that make up the majority of conversations and these would be even more simplistic and common with children or people speaking with a lower vocabulary. I think it only requires like 500 words for the more simple languages and a good English speaker knows upwards of 15,000. There are even some whistle languages but I'm not sure how many whistles they have.

Anyways. I'll get into the specifics of how I programmed these in later on but the simplified version is this. When they vocalize the VM (Voice Macro) will make a list, a dictation, of everything it thinks it hears as words. If I were to say "Hello" it would show up on the list as "hello" or whatever it thought I said: "a low" or "hallow" etc. If I clap my hands it might think I said "get" or "at" - but it doesn't just come up with a random guess - these dictations are repeatable. So if I clap my hands again the same way it will come up with "at" or "get" each time.

Like with any other voice macro I can program it to perform an action every time it hears/ recognizes the same words. In this case I can program the computer to say something if it hears a particular word or phrase. If it hears "hello" I can program it to say "hello" or if it recognizes "hello" I could have it say "hi, how are you?" etc etc.

Since the birds make different vocalizations and these are dictated by the computer as different words and phrases I can then program each new word and each word from each phrase/ sentence to be recognized and then spoken back. 

So if the bird's chirping makes the computer think it heard someone say "each" I can then add "each" to the dictionary and then have the program say "each" back, in English, every time it processes "each" again. This is essentially turning the bird's talking into English translations - even if they are not actually speaking English words. Don't forget that they ARE actually capable of speaking English words! But for now that wouldn't matter.
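The speak-back idea can be sketched as a small phrase book: a dictated phrase goes in, and the text to speak comes out. This is an illustration of the logic only, not how VoiceMacro stores its commands; the `PhraseBook` class and its method names are hypothetical, and actually speaking the reply (text to speech) is left out.

```python
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so 'Each  you' matches 'each you'."""
    return " ".join(text.lower().split())


class PhraseBook:
    """Maps a dictated phrase to the text the computer should say back."""

    def __init__(self) -> None:
        self.replies: Dict[str, str] = {}

    def add(self, heard: str, reply: Optional[str] = None) -> None:
        """Program in a phrase; by default the reply just echoes the phrase."""
        key = normalize(heard)
        self.replies[key] = reply if reply is not None else key

    def respond(self, heard: str) -> Optional[str]:
        """Return the text to speak, or None if the phrase is not programmed in."""
        return self.replies.get(normalize(heard))


book = PhraseBook()
book.add("each")                       # echo the dictated word back
book.add("hello", "hi, how are you?")  # or answer with something else
```

With this, `book.respond("Each")` gives back `"each"`, and a phrase that has never been added gives back nothing.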

As the program is open and listening it will continue to attempt to turn the sounds it hears into dictated words - speech to text or in this case sounds to text. Since the birds are making the same sounds and the same sounds in different combinations the list of dictated words becomes a dictionary from which I can take from and program in these words and phrases to be said back in english if and when they say them again. 

What I began to notice was they were saying the same sounds/words over and over but in different combinations.

Each much
Each you much much
Each you you much much you much


Much much each you much you you
You much each
Each each each

Each string of words becoming phrases and 'sentences' which would also have to be added to the dictionary. 

A problem is that oftentimes they don't pause between talking and will go on for a long period of time during which the computer doesn't pause either. So when they finally do take a break that's when it shows what they have said. These longer strings of words are probably not going to be said again but sometimes they are filled with NEW WORDS, which can then be added to the dictionary individually. So if they say them again they will be recognized. In this way you can develop an evolving phrase book of their sound/words which, after being programmed in, will be spoken back to you in English if they say them again even when you are not watching the dictation panel.
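Pulling the new words out of one of those long strings is easy to sketch in code. The function below assumes the dictation is plain text and that the dictionary so far is a set of lowercase words; the `new_words` name and the example words are only for illustration.

```python
from typing import List, Set


def new_words(dictation: str, known: Set[str]) -> List[str]:
    """Return words in the dictation that are not in the dictionary yet, in first-seen order."""
    seen: Set[str] = set()
    fresh: List[str] = []
    for raw in dictation.lower().split():
        word = raw.strip(".,!?\"'")  # drop punctuation the dictation may have added
        if word and word not in known and word not in seen:
            seen.add(word)
            fresh.append(word)
    return fresh


known = {"each", "you", "much"}
print(new_words("Each you much six much you eat six", known))
```

Each word that comes back can then be added to the dictionary on its own, even if the long string it came from is never said again.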

There will be a lot of 'junk' language or mistranslations or things that will only be said once but eventually there will be solid and verifiable translations that can even be extracted for meaning.

The budgie versions of "hello, where are you, how are you, their names, their words for food, etc etc" 

To me these vocalizations all sound the same. Because of the limits of my ears (hearing range), my concentration, and the speed at which they vocalize, I am physically unable to differentiate between anything they say. However, using this method and the computer's help I can start to detect patterns in the English words and phrases, even if they are not what the birds are actually saying or even if they are not traditional words - they are definitely language. When the sound they make, "chirp", gets translated to "each" that doesn't mean they are really saying "each", but now I am able to use that as the basis to understand what "chirp" really means because I can now differentiate it and see what they are doing when they say it.

Sound Waves:

Essentially they are hearing and then duplicating sound waves. Some of these sound waves are not recognizable by our human ears but that doesn't mean they are not meant to be a duplicate of the intended sound wave. They could be saying "hello" but too quickly or not within the frequency range we can hear - but the sound wave itself, which is what the computer processes, is close enough to "hello" to be processed correctly. Almost like trying to listen to a language you barely understand sped up, or being color blind. I've slowed their talking down and it sounds a lot more like 'talking' than at normal speed.
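For anyone who wants to try the slow-down themselves, here is one simple way to do it with Python's standard `wave` module. It is not necessarily how the recordings for this post were slowed; it assumes an uncompressed WAV recording, and the `slow_down` name and file paths are placeholders. It copies the audio unchanged but declares a lower sample rate, so players stretch it out and the pitch drops along with the speed.

```python
import wave


def slow_down(src_path: str, dst_path: str, factor: float = 4.0) -> None:
    """Copy a WAV file, declaring a lower sample rate so it plays `factor` times slower."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        # Same samples, lower declared rate: slower playback and lower pitch.
        dst.setframerate(max(1, int(params.framerate / factor)))
        dst.writeframes(frames)
```

Because the pitch drops too, chirps that were too high to tell apart should land in a range that is easier on human ears.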

The biggest roadblock to most animal communication isn't their intelligence or ability or desire to communicate but their physical vocal limitations and our own limits of patience, attention and hearing.

This methodology (via technology) helps to fix those challenges and I believe with the right people and animals behind this idea it could quickly lead to a lot of childhood dreams being fulfilled. The ability to talk to animals and have them not only understand you but talk back. 

Learning & Teaching

BudgieTalk also acts as a teaching device since the computer will be speaking back to them over and over - which is all it takes for them to learn new words. If you say "hello" over and over eventually they will learn to say "hello" so with this process, even if their vocalizations are not corresponding with the word being dictated - it is then spoken each time - essentially teaching and translating that word anyways. So those accidental words - become the word/ meaning they were speaking. 

Eventually the program, once it has been developed with these birds, can be used with other birds, which will then teach them the same words/ phrases and process - adding to it, and then further refining the language and technology. Birds teach each other to talk as well as people teaching birds to talk so all it takes is showing other birds videos of birds talking - and they will learn the same words.

Or in other words you show a video to birds of these birds using BudgieTalk then let them use BudgieTalk and they will be able to learn on their own with the program. Then their own BudgieTalk profiles / recordings can be shown to the original birds and they will hybridize and so on - increasing in sophistication and elaboration with time and the amount of birds using it.


Setting up Microphone. 

Logitech C920 (and C910) are great, they are the ones that I use. Any webcam will work as long as it has a microphone.

You will need an attached microphone. A webcam will usually have one. There are USB microphones and the 3.5mm kind, which plugs into the mic jack on your computer (usually pink).

usb -$43.99
I have this Blue - Snowball - it's really nice and can work for other purposes as well. I had to get another so I could record two things at once.
3.5mm $14.99
 The 3.5's are usually cheaper.

This is a best selling one on Amazon for only $7.99 (haven't tried it myself)

Using your phone as a wireless microphone:

You can also get an app for your phone that will let you use it as a microphone that your computer can recognize over wifi/ bluetooth: Using your phone as your microphone - Directions

Selecting your microphone as the default device:

Click on Windows key | Type in: "Sounds" or "Change System Sounds"
Click on the device/ microphone you would like to use as the sound input and set it as the default. 
Alternatively you can right-click on the Sound icon in the bottom right-hand corner of your screen, select Recording Devices, then do the same.

If it's not already selected, select the microphone you want to use, then click on "Set Default".


The text in black are dictations that are new and not recognized. They are what the program thinks it's hearing. The text in green are words and phrases that used to be black but have been saved into the program so that it will speak back those words if they are recognized again. For example if the computer translates their vocalizations as "you you" and it has been said before and programmed in - it will then say out loud in its computer voice "you you" the next time it's recognized.

Install Voice Macro

Direct download

Installer: VoiceMacro_1.2.2_Setup.msi
ZIP file: VoiceMacro_1.2.2.zip

The Installer version will install it like any other program. Download it - start the install. Open like any other program. Windows - type in or find: "Voice Macro" Open the Program.
The Zip file needs to be extracted.

  •  http://www.7-zip.org/ (Free Unzipper) 
  • Extract to a folder on your desktop or hard drive. The icon to open VoiceMacro will be in that folder. Click on it to open the program.

VoiceMacro is an awesome program that uses Windows Speech Recognition and allows you to easily program your computer to understand different voice commands. For example you can train it to recognize "Open Google" and then it will open your browser and go to google.com. Or "Minimize Window" and it will minimize your window etc. It's really great for creating shortcuts in Photoshop and other programs but I'll save that for another post.

What we want to use VoiceMacro for is to get it to recognize the bird's vocalizations and translate it as text - then to take that text and have the VM translate those back into words.

Eg: If the bird makes a "Chirp" sound the VoiceMacro will be listening and try to turn that "Chirp" sound into a word. Let's say that it thinks "Chirp" is "Each" so it will write out "each" whenever the bird makes the "chirp" sound. You can do the same thing with your own voice by saying "Hello" and it will write down "Hello" or whatever it thinks it hears. The next step is to program it to DO something every time it recognizes "hello" or in this case "chirp"/ "each".

What we do next is to keep the program open and allow it to listen to the birds vocalizing. While it does this it will keep trying to recognize words from those sounds and display them as black words and phrases/ sentences. 

This text is called Speech to Text or in this case it's bird sounds to text. Now we can take that text and program it into voice macro as phrases and words that will be spoken by the computer when they are recognized.

Eg: If we say "hello" the computer would show "hello" then we program it to say back "hello" whenever it hears "hello", acting ironically like a "parrot" that just mimics back anything you say - BUT it has to be programmed into the VoiceMacro program first.

There is a wide variety of things we can program it to do when it recognizes a certain word/ phrase but in this case we are simply going to see what the computer recognizes the birds' sounds to be - then program them to be said back.

So if the bird makes the "Chirp" sound and it's recognized by the computer as "Each" we will then program VoiceMacro to speak "Each", which means that every time they make the "chirp" sound the computer will say "Each".

The thing is they don't just say "Chirp" they say "chirp chirp" or "Chirpppp Chirp Chirrrrrppp p p p" etc etc etc. They basically sit around all day and come up with new combinations of sounds using the same basic sounds and every once in a while using new sounds and then new combinations of these sounds over and over like children do as they babble and learn to talk.

Obviously the majority of their vocalizations are not English words or phrases but they are repeated and not always random. This means that while "chirp" does not literally mean "each" the computer will fairly accurately say "each" every time they say "chirp", which enables us to discern and recognize their vocalizations since they are usually too fast and too high in pitch to hear properly or tell the difference. To me they just sound like they are making the same sounds over and over but if you listen carefully they are making multiple vocalizations and in different tones, repetitions and patterns which the computer CAN recognize and tell the difference between.

All we are doing is programming the computer to recognize and label the different vocalizations and to then speak them back when they repeat them again.

Unfortunately I have not found a way to automate the process and have to watch the list of words and phrases it thinks it heard and then add them in manually. Someone with better programming knowledge could probably fix this but for now this is working surprisingly well. 
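As a starting point for whoever picks this up, the manual step could be automated roughly like this: count every dictation, and once a phrase has been heard a set number of times, add it to the spoken-back list automatically. This is a sketch of the logic only; it does not hook into VoiceMacro, and the `AutoPhraseBook` class, its method names and the threshold are assumptions of mine.

```python
from collections import Counter
from typing import Dict, Optional


class AutoPhraseBook:
    """Promote a dictated phrase to a spoken reply once it has been heard `threshold` times."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.replies: Dict[str, str] = {}

    def hear(self, dictation: str) -> Optional[str]:
        """Record one dictation; return the reply to speak, or None if not promoted yet."""
        phrase = " ".join(dictation.lower().split())
        if not phrase:
            return None
        self.counts[phrase] += 1
        if phrase not in self.replies and self.counts[phrase] >= self.threshold:
            self.replies[phrase] = phrase  # echo the phrase back by default
        return self.replies.get(phrase)
```

The threshold keeps one-off 'junk' dictations out of the list while still picking up anything the birds repeat.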

Again, these are not perfect translations of them speaking English - the majority are just sounds being turned to random words. BUT since budgies can potentially learn hundreds of different words that we can recognize and understand it stands to reason that the computer could do the same (since it can recognize us speaking English words) and some of these vocalizations are in fact - English words. These are then occasionally processed correctly even if we are not able to hear them correctly. In other words they are actually saying "Hello" or "eat" or "sing" or "one two three" but too quickly for me to recognize but not too fast for the computer.

Keep in mind that almost every single word and every single phrase was generated by the birds' vocalizations, and some of these vocalizations could have been English words - some could have been accurate and actually processed correctly. For example when one of them starts eating and "eat" or "eat much to much" comes up it's likely that it's not a coincidence and that one of them said "eat" and the computer recognized it as "eat" in the same way that if I said "eat" the computer could recognize it as "eat".

It would also be no surprise that they knew the word "eat" since I say it every day when I feed them. I've been actively talking to them, in context, obsessively for years. I have them watch baby and toddler learning videos and they watch clearly talking budgie videos on YouTube all the time.

Some words could be mis-heard: "a low" is "hello". Or they may not be able to pronounce certain sounds so they could be saying "ood" instead of "food".

The main goal besides just being able to differentiate their vocalizations is to assist in teaching them to talk. When the BudgieTalk is on it is constantly talking and repeating phrases and words that they themselves are controlling - even if it's only inadvertently they will eventually realize that when they make certain sounds - they will cause certain other sounds to be made. Then they will hear these english words and start to learn them and say them back - creating a type of teaching program. 

Another aspect is that even if they are not speaking English words or phrases the sounds they are making do have some kind of meaning. For example when one of them is in another room the others always make the same sound to call to them. This sound will be turned into an English word or phrase (even if it's not English) and that phrase can then be translated into its English equivalent. So if their call sounds like "chhhhirp, chiiiiirp!" and the computer thinks it is "each, each", once we realize that it 'means' "where are you" or "where did you go" or "come back" we can program the computer to say those phrases - essentially acting like a translator.
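That translation step can be sketched as a second lookup on top of the dictation: known phrases are swapped for the meaning a person has assigned, and anything unknown passes through as the raw dictated word. The `translate` function and the "eat" entry are made up for illustration; only the "each each" example comes from the call described above.

```python
from typing import Dict, List

# Meanings assigned by a human after watching the birds (hypothetical table).
MEANINGS: Dict[str, str] = {
    "each each": "where are you?",
    "eat": "food",
}


def translate(dictation: str, meanings: Dict[str, str] = MEANINGS) -> str:
    """Replace known phrases with their assigned meaning, preferring longer phrases."""
    words = dictation.lower().split()
    longest = max((len(key.split()) for key in meanings), default=1)
    out: List[str] = []
    i = 0
    while i < len(words):
        # Try the longest possible phrase starting at this word first.
        for size in range(min(longest, len(words) - i), 0, -1):
            chunk = " ".join(words[i:i + size])
            if chunk in meanings:
                out.append(meanings[chunk])
                i += size
                break
        else:
            out.append(words[i])  # unknown word: pass the raw dictation through
            i += 1
    return " ".join(out)
```

Matching longer phrases first matters because "each each" as a call may mean something different from a single "each".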

It's already well established that animals, especially birds, will use specific calls for specific things and instances. They have their own calls for each other, for different foods, for water, for danger - for different dangers etc. It would be fairly easy to record these sounds and identify what they are referring to and then have the computer say/ translate the meaning. "There is food" "There is water" "There is a dog" "There is a human" "There is a dangerous human" etc.

A program like that could be relatively easy to develop and then just altered for different species. It only requires some (probably) simple programming. The real challenge would be in the observation and then recording to relate those sounds/ vocalizations to the behavior. That would be obviously difficult to do with animals in the wild, whales, birds, since they are rarely localized or stay in one spot. This is why birds in captivity would be perfect to develop and refine the technology first. 

BudgieTalk is essentially a prototype of a Universal Translator. The basis of it would be simply recording and recognizing different sound waves that can then be labeled and then translated into another form that the observer could understand. Since I'm unable to program, or to find a program that does that, I used Windows Speech to Text and VoiceMacro and their ability to turn sounds into text - as a surprisingly effective alternative.

To do this properly would require recording them, manually looking at each sound wave (or listening with headphones), and then matching each sound to other times they made it and to what they were doing. Until I can figure out how to have the computer automatically add the sounds and phrases, I'm going to have to do it manually. I'm also going to have to devise ways to discern the meanings so I can assign them to the noises.
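The manual matching described above can be sketched as a small tally. This is only an illustration, not something BudgieTalk does today: the `observations` log, its entries, and the function names are all made up for the example. Each entry pairs the word the computer heard with what the birds were doing at the time.

```python
from collections import Counter, defaultdict

# Hypothetical hand-written log (invented entries, not real recordings):
# (word the computer heard, what the birds were doing at the time)
observations = [
    ("each", "calling to bird in other room"),
    ("each", "calling to bird in other room"),
    ("each", "eating"),
    ("six", "looking at me"),
]

def tally_contexts(log):
    """Count how often each recognized word shows up with each behavior."""
    table = defaultdict(Counter)
    for word, behavior in log:
        table[word.lower()][behavior] += 1
    return table

def most_likely_context(table, word):
    """Return the most frequent behavior for a word, or None if never logged."""
    counts = table.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

table = tally_contexts(observations)
print(most_likely_context(table, "each"))
```

The more entries the log has, the more the most common behavior for a word can be trusted as a guess at its meaning.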

For example, their top (most used) words are:

Six (I've been trying to teach them numbers)

So it's likely that none of those words mean their English equivalent; they are the birds' words for each other, for me, for emotions, and things like that. For example, "each" seems to mean "Where are you, come here, over here, come, where" and works as a call to "come back". It's not that they are saying "each" but that their call sounds (to the computer) most like "each". What I can then do is have the computer say "Where are you, come back, over here" every time it recognizes "each" and then see whether that corresponds to what they are doing when they say it - it does.
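As a rough sketch of that idea: a lookup table goes from the word the computer heard to the phrase it should speak. The `MEANINGS` table and `translate` function are made-up names for illustration and are not part of VoiceMacro or Windows Speech to Text. A word with no assigned meaning yet is simply echoed back, the same way a plain macro would speak it.

```python
# Hypothetical meaning table: recognized word -> phrase to speak aloud.
MEANINGS = {
    "each": "Where are you? Come back, over here.",
}

def translate(recognized):
    """Return the assigned phrase, or echo the word if no meaning is known yet."""
    key = recognized.strip().lower()
    return MEANINGS.get(key, key)

print(translate("Each"))  # a word with an assigned meaning
print(translate("six"))   # no meaning assigned yet, so it is echoed
```

New meanings are added by putting another entry in the table once the behavior has been observed and verified.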

I have to be able to see and understand what they are doing when they make those sounds, then program the meaning into the computer, and then verify it later.

They are probably saying

 "Come here, kiss me, stop, go away, come back, i am hungry, where are you, I love you, I hate you" 

Over and over and over in different ways. They are like little crazy kids on speed.

  • The vocalizations aren't always random and unique. There are sounds they make over and over: "Churp", "Tweet", "Chhh" as examples.
  • The computer's Speech to Text will interpret the "churp" as the text "each". Even though they are not actually 'saying' the English word "each", the computer will still recognize and translate the sound the same way each time.
  • So every time they make the "Churp" noise the computer will recognize it as "each".
  • Then "each" can be programmed into the computer as a Voice Macro that says "each" out loud (Text to Speech).
  • This means that when a bird makes the "churp" sound the computer will then speak "each", acting as a translator.
  • With that same basic principle an entire word/sound/vocalization library can be built up.

"Churp" isn't just used as a single sound; it can be modified enough that the computer will recognize it as a different word. So "Churrrrp" and "Chhhhhhurp" are recognized as the English words "each" or "much", adding even more words to the dictionary.


They do say the same words over and over but usually they are using long strings of words in different combinations.

"Churp Churp" or "Churpppp churp Churp"
or "Tweet Tweet Churp"
or "Tweet Churp Churp Tweet"
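One way to start looking at these combinations is to count which sounds follow which. This is only a sketch using the example strings above. The function name `count_pairs` is made up for the illustration, and nothing here says what the combinations mean.

```python
from collections import Counter

def count_pairs(strings):
    """Count adjacent pairs of sounds, without pairing across separate strings."""
    pairs = Counter()
    for text in strings:
        words = text.lower().split()
        pairs.update(zip(words, words[1:]))
    return pairs

# The example strings from above.
strings = ["Churp Churp", "Tweet Tweet Churp", "Tweet Churp Churp Tweet"]
pairs = count_pairs(strings)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs that turn up far more often than others would be the first ones worth watching for a matching behavior.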


Adding a Macro

Set up (after VoiceMacro is installed and open):
Add a New Profile
Add a macro "Hello", then program it to Speak "hello"

Importing Budgie Talk:
Soon I'll provide a file that you can import to get all the words we've developed so far.

This is what the program looks like. You click on "Add Macro" to add a new word or phrase. 

Then you type it in. This is the word or phrase that the computer thinks it has heard.

Then you click on "other" and then on "speak text", and enter the same word or phrase. The next time that word or phrase is recognized, the computer will speak it.

How to test if it's working:

One way to verify that they understand me and that it's working is to tell them to say something and then see if it shows up. I could say

"okay can you guys count, can you say 12345678910??"

Then within a short time numbers would start to be said. Not just random numbers but numbers in order - this has happened a few times, followed by random numbers and words combined. Like kids, they start to combine new sounds, put them in different orders, and play around with them. This isn't any different from saying

"Say hello, who's a pretty bird" and they say back "Pretty bird, helllo"

If the computer was then able to get even a single word, "hello" or "pretty bird", right after you told them to say it, that would verify that the computer was working and translating, that they were speaking some English words, and that some of it is intentional.


Intentional is slightly different from contextual. Repeating back what you say is one thing, but saying something on their own that is related to what they are doing - in context - is another. An example would be me not saying anything, one of them starting to eat, and then "eat" or "eating" or something to do with the word "eat" showing up. That would mean that they are not only speaking in English but speaking contextually and without being told to. This has happened at least twice. If "eat" never appears in 1000 words and phrases, and then all of a sudden one of them starts to eat and "eat" shows up, it would be less of a coincidence and more of a verification. "Singing" or "sing" showing up when music is playing or when they are singing is another sign of context. It's not that they are mimicking the word "sing" when I say "sing": the computer will actually show "Singing", "You Sing", "Sing much" and other variations using the word "sing" - none of which I've taught them, and all said on their own without any prodding. They are much more likely to say a word on their own later, NOT when I try to get them to.
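A simple way to check for context is to compare how often a word shows up while they are doing the matching thing versus the rest of the time. The sketch below uses a tiny invented log purely to show the arithmetic; the entries are not real results, and the names `context_rates` and `sample_log` are made up for the example.

```python
def context_rates(log, stem, behavior):
    """Return (rate during the behavior, rate at all other times) for words
    that start with `stem`, e.g. "eat" matches both "eat" and "eating"."""
    during = [word for word, doing in log if doing == behavior]
    other = [word for word, doing in log if doing != behavior]

    def rate(words):
        if not words:
            return 0.0
        hits = sum(1 for w in words if w.lower().startswith(stem))
        return hits / len(words)

    return rate(during), rate(other)

# Invented example log: (recognized word, what the birds were doing).
sample_log = [
    ("eat", "eating"),
    ("each", "eating"),
    ("eating", "eating"),
    ("six", "idle"),
    ("each", "idle"),
    ("much", "idle"),
]

in_context, out_of_context = context_rates(sample_log, "eat", "eating")
print(in_context, out_of_context)
```

A word that is common during the behavior and rare or absent otherwise is a better sign of context than a word that shows up everywhere.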


There is the added complication that they usually don't speak one at a time but back and forth and over each other, which probably confuses the computer. I'm working on a way to keep them separated so I can record one at a time, but then they won't talk as much, so I need a setup where they can hear each other but the camera doesn't pick them up at the same time.

Games and Aids:

  • Make flash cards that are associated with different words/sounds.
  • Hold up a number of fingers to prompt them to say that number.

Future Topics:

