Aggressive Dogs Don’t Bite (Or Do They?)

If you rank dog traits by reported bites, the “dangerous” breeds are nowhere near the top — the biters are the confident, friendly, trainable family dogs. That sounds like a scandal. It’s actually a lesson in base rates, and this notebook walks straight into the trap before explaining the way out.

1. The Data

We join two datasets by breed:

  • Health_AnimalBites.csv — reported animal bite cases from Louisville, KY (Kaggle), with the biting breed recorded free-form.
  • dog_breeds.csv — breed profiles including a comma-separated list of character traits per breed.
Code
import pandas as pd
assets_folder = 'assets'
dog_breeds = pd.read_csv(f'{assets_folder}/dog_breeds.csv')
print(dog_breeds['Breed'].unique())
dog_breeds.head()
<StringArray>
[         'Labrador Retriever',             'German Shepherd',
                     'Bulldog',                      'Poodle',
                      'Beagle',                   'Chihuahua',
                       'Boxer',            'Golden Retriever',
                         'Pug',                  'Rottweiler',
 ...
 'Wirehaired Pointing Griffon',              'Xoloitzcuintli',
           'Yorkshire Terrier',                       'Akita',
                   'Africanis',                     'Basenji',
       'Catahoula Leopard Dog',         'Miniature Shiba Inu',
            'Belgian Tervuren',               'Pharaoh Hound']
Length: 103, dtype: str
  | Breed | Country of Origin | Fur Color | Height (in) | Color of Eyes | Longevity (yrs) | Character Traits | Common Health Problems
0 | Labrador Retriever | Canada | Yellow, Black, Chocolate | 21-24 | Brown | 10-12 | Loyal, friendly, intelligent, energetic, good-... | Hip dysplasia, obesity, ear infections
1 | German Shepherd | Germany | Black, Tan | 22-26 | Brown | 7-10 | Loyal, intelligent, protective, confident, tra... | Hip dysplasia, elbow dysplasia, pancreatitis
2 | Bulldog | England | White, Red | 12-16 | Brown | 8-10 | Loyal, calm, gentle, brave | Skin allergies, respiratory issues, obesity
3 | Poodle | France | White, Black, Brown, Apricot | 10-15 | Brown, Blue | 12-15 | Intelligent, active, affectionate, hypoallergenic | Hip dysplasia, epilepsy, bladder stones
4 | Beagle | England | White, Tan, Red, Lemon | 13-15 | Brown | 12-15 | Curious, friendly, energetic, good-natured | Ear infections, hip dysplasia, epilepsy
Code
animal_bites = pd.read_csv(f'{assets_folder}/Health_AnimalBites.csv')
print(animal_bites['BreedIDDesc'].unique())
animal_bites
<StringArray>
[              nan,   'GERM SHEPHERD',       'DACHSHUND',        'PIT BULL',
        'SHIH TZU',  'COCKER SPAINEL',      'CHICHAUHUA',          'BEAGLE',
       'CHOW CHOW',           'OTHER',
 ...
 'TOY FOX TERRIER',      'RED HEELER',      'WEINER DOG',        'MALAMUTE',
   'IRISH SPANIEL',         'BESINJI',  'BEARDED COLLIE',     'STAN POODLE',
  'AMER FOX HOUND', 'IRISH WOLFHOUND']
Length: 102, dtype: str
  | bite_date | SpeciesIDDesc | BreedIDDesc | GenderIDDesc | color | vaccination_yrs | vaccination_date | victim_zip | AdvIssuedYNDesc | WhereBittenIDDesc | quarantine_date | DispositionIDDesc | head_sent_date | release_date | ResultsIDDesc
0 | 1985-05-05 00:00:00 | DOG | NaN | FEMALE | LIG. BROWN | 1.0 | 1985-06-20 00:00:00 | 40229 | NO | BODY | 1985-05-05 00:00:00 | UNKNOWN | NaN | NaN | UNKNOWN
1 | 1986-02-12 00:00:00 | DOG | NaN | UNKNOWN | BRO & BLA | NaN | NaN | 40218 | NO | BODY | 1986-02-12 00:00:00 | UNKNOWN | NaN | NaN | UNKNOWN
2 | 1987-05-07 00:00:00 | DOG | NaN | UNKNOWN | NaN | NaN | NaN | 40219 | NO | BODY | 1990-05-07 00:00:00 | UNKNOWN | NaN | NaN | UNKNOWN
3 | 1988-10-02 00:00:00 | DOG | NaN | MALE | BLA & BRO | NaN | NaN | NaN | NO | BODY | 1990-10-02 00:00:00 | UNKNOWN | NaN | NaN | UNKNOWN
4 | 1989-08-29 00:00:00 | DOG | NaN | FEMALE | BLK-WHT | NaN | NaN | NaN | NO | BODY | NaN | UNKNOWN | NaN | NaN | UNKNOWN
...
8998 | 2017-09-05 00:00:00 | DOG | NaN | NaN | NaN | NaN | NaN | 40243 | NaN | UNKNOWN | NaN | NaN | NaN | NaN | NaN
8999 | 2017-09-07 00:00:00 | DOG | POMERANIAN | MALE | RED | NaN | NaN | 40204 | NaN | HEAD | NaN | NaN | NaN | NaN | NaN
9000 | 2017-09-07 00:00:00 | DOG | LABRADOR RETRIV | MALE | BROWN | NaN | NaN | 47130 | NaN | UNKNOWN | NaN | NaN | NaN | NaN | NaN
9001 | 2017-09-07 00:00:00 | DOG | LABRADOR RETRIV | FEMALE | BLK WHT | NaN | NaN | 40229 | NaN | BODY | NaN | NaN | NaN | NaN | NaN
9002 | 2017-09-07 00:00:00 | DOG | BOXER | NaN | BRN BLK | NaN | NaN | 40229 | NaN | BODY | NaN | NaN | NaN | NaN | NaN

9003 rows × 15 columns

2. Taming the Breed Names

The bite reports store breeds as inconsistent, often abbreviated uppercase strings (GERM SHEPHERD, CHICHAUHUA, WEINER DOG…). To join them with the breed-trait table we map every variant to a canonical breed name. Anything we cannot confidently match returns an empty string and is excluded from the analysis.

Code
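def normalized_breed_bites(row):
    """Map a raw breed string from either dataset to a canonical breed name.

    Returns '' when the breed cannot be matched confidently (including NaN).
    """
    # Bite reports use 'BreedIDDesc'; the breed-profile table uses 'Breed'
    breed = row['BreedIDDesc'] if 'BreedIDDesc' in row else row['Breed']
    # match/case below requires Python 3.10+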
    match breed:
        case 'AAUST. TERR.' | 'Australian Terrier':
            return 'Australian Terrier'
        case 'AIREDALE TER.':
            return 'Airedale Terrier'
        case 'AKITA' | 'Akita':
            return 'Akita'
        case 'ALASK. MALAMUTE' | 'MALAMUTE':
            return 'Alaskan Malamute'
        case 'AM BULLDOG' | 'AMER. BULL DOG' | 'American Bulldog':
            return 'American Bulldog'
        case 'AM. ESKIMO' | 'American Eskimo Dog':
            return 'American Eskimo Dog'
        case 'AMER FOX HOUND':
            return 'American Foxhound'
        case 'BASANJI' | 'BESINJI' | 'Basenji':
            return 'Basenji'
        case 'BASSET HOUND' | 'Basset Hound':
            return 'Basset Hound'
        case 'BEAGLE' | 'Beagle':
            return 'Beagle'
        case 'BEARDED COLLIE':
            return 'Bearded Collie'
        case 'BERNESEN MT.':
            return 'Bernese Mountain Dog'
        case 'BICHON FRESE' | 'BICHON FRISE' | 'Bichon Frise':
            return 'Bichon Frise'
        case 'BLACK LAB' | 'LABRADOR RETRIV' | 'Labrador Retriever':
            return 'Labrador Retriever'
        case 'BLOOD HOUND' | 'Bloodhound':
            return 'Bloodhound'
        case 'BLUE HEELER' | 'CATTLE DOG' | 'HEELER' | 'RED HEELER' | 'Australian Cattle Dog' | 'Australian Stumpy Tail Cattle Dog':
            return 'Australian Cattle Dog'
        case 'BORDER COLLIE' | 'BORDER COLLIE M' | 'Border Collie':
            return 'Border Collie'
        case 'BOSTON TERRIER' | 'Boston Terrier':
            return 'Boston Terrier'
        case 'BOUVIER':
            return 'Bouvier des Flandres'
        case 'BOX TERRIER' | 'BOXER' | 'Boxer':
            return 'Boxer'
        case 'BRIARD':
            return 'Briard'
        case 'BRITNEY SPANIEL' | 'COCKER SPAINEL' | 'IRISH SPANIEL' | 'Cocker Spaniel' | 'English Springer Spaniel' | 'English Toy Spaniel' | 'Field Spaniel' | 'Irish Water Spaniel' | 'Welsh Springer Spaniel':
            return 'Spaniel'
        case 'BULL DOG' | 'ENGLISH BULLDOG' | 'FRENCH BULLDOG' | 'Bulldog' | 'French Bulldog' | 'Bull Terrier' | 'Staffordshire Bull Terrier':
            return 'Bulldog'
        case 'BULLMASTIFF':
            return 'Bullmastiff'
        case 'CANE CORSO' | 'Cane Corso':
            return 'Cane Corso'
        case 'CATAHOULA' | 'Catahoula Leopard Dog':
            return 'Catahoula Leopard Dog'
        case 'CHICHAUHUA' | 'Chihuahua':
            return 'Chihuahua'
        case 'CHOW CHOW':
            return 'Chow Chow'
        case 'COCKAPOO':
            return 'Cockapoo'
        case 'COLLIE' | 'Australian Shepherd':
            return 'Collie'
        case 'COON HOUND':
            return 'Coonhound'
        case 'CORGI' | 'WELSH CORGI' | 'Welsh Corgi' | 'Cardigan Welsh Corgi':
            return 'Welsh Corgi'
        case 'DACHSHUND' | 'DOTSON' | 'WEINER DOG' | 'Dachshund':
            return 'Dachshund'
        case 'DALMATIAN':
            return 'Dalmatian'
        case 'DOBERMAN' | 'Doberman Pinscher':
            return 'Doberman Pinscher'
        case 'ENG. MASTIFF' | 'MASTIF' | 'Mastiff':
            return 'Mastiff'
        case 'ENGLISH SETTER' | 'English Setter' | 'Gordon Setter' | 'Irish Setter' | 'IRISH SETTER':
            return 'Setter'
        case 'ENGLISH SHEPARD' | 'OLD ENG SHP DOG' | 'SHEEP DOG' | 'Old English Sheepdog' | 'Shetland Sheepdog' | 'SHELTIE' | 'Anatolian Shepherd' | 'Belgian Malinois' | 'Belgian Tervuren':
            return 'Sheepdog'
        case 'FOX TERRIER' | 'TOY FOX TERRIER' | 'Cairn Terrier' | 'Irish Terrier' | 'Jack Russell Terrier' | 'Kerry Blue Terrier' | 'LAKELAND TER.' | 'RAT TERRIER' | 'SCOTTISH TER.' | 'Welsh Terrier' | 'WESTIE' | 'YORKSHIRE TERRIER' | 'Australian Terrier' | 'Border Terrier' | 'West Highland White Terrier' | 'Yorkshire Terrier':
            return 'Terrier'
        case 'FOX TERRIER MIX':
            return 'Fox Terrier Mix'
        case 'GERM SHEPHERD' | 'German Shepherd':
            return 'German Shepherd'
        case 'GOLD RETRIEVER' | 'GOLDEN LAB' | 'Golden Retriever' | 'Chesapeake Bay Retriever' | 'Flat-Coated Retriever':
            return 'Golden Retriever'
        case 'GREAT DANE' | 'Great Dane':
            return 'Great Dane'
        case 'GREAT PYRENEESE' | 'PYRENES' | 'Great Pyrenees':
            return 'Great Pyrenees'
        case 'GREYHOUND' | 'Greyhound' | 'Italian Greyhound' | 'Scottish Deerhound' | 'Whippet':
            return 'Greyhound'
        case 'HAVANESE':
            return 'Havanese'
        case 'HUSKY' | 'SIBERAN HUSKY' | 'Siberian Husky':
            return 'Siberian Husky'
        case 'IRISH WOLFHOUND' | 'Irish Wolfhound':
            return 'Irish Wolfhound'
        case 'LHASA APSO' | 'Lhasa Apso':
            return 'Lhasa Apso'
        case 'MALTASE' | 'Maltese':
            return 'Maltese'
        case 'MIN PIN' | 'Miniature Pinscher':
            return 'Miniature Pinscher'
        case 'NEW FOUNDLAND' | 'Newfoundland':
            return 'Newfoundland'
        case 'OTHER':
            return 'Mixed Breed'
        case 'PEKINGESE' | 'Pekingese':
            return 'Pekingese'
        case 'PIT BULL':
            return 'American Pit Bull Terrier'
        case 'POMERANIAN' | 'Pomeranian':
            return 'Pomeranian'
        case 'POODLE' | 'STAN POODLE' | 'TOY POODLE' | 'Poodle' | 'Toy Poodle' | 'Miniature Poodle' | 'Standard Poodle':
            return 'Poodle'
        case 'PUG' | 'Pug':
            return 'Pug'
        case 'ROTTWEILER' | 'Rottweiler':
            return 'Rottweiler'
        case 'SAINT BERNARD' | 'ST BERNARD' | 'Saint Bernard':
            return 'Saint Bernard'
        case 'SAMOYED' | 'Samoyed':
            return 'Samoyed'
        case 'SCHNAUZER' | 'Miniature Schnauzer' | 'Standard Schnauzer' | 'Giant Schnauzer':
            return 'Schnauzer'
        case 'SHAR-PEI' | 'Shar Pei':
            return 'Shar-Pei'
        case 'SHIH TZU' | 'Shih Tzu':
            return 'Shih Tzu'
        case 'SPITZ' | 'Finnish Spitz':
            return 'Spitz'
        case 'WEIMARANER' | 'Weimaraner':
            return 'Weimaraner'
        case 'Pointer':
            return 'Pointer'
        case 'Papillon':
            return 'Papillon'
        case 'Affenpinscher':
            return 'Affenpinscher'
        case 'Brussels Griffon':
            return 'Brussels Griffon'
        case 'Chinese Crested':
            return 'Chinese Crested'
        case 'Coton de Tulear':
            return 'Coton de Tulear'
        case 'Australian Kelpie':
            return 'Australian Kelpie'
        case 'Borzoi':
            return 'Borzoi'
        case 'Saluki':
            return 'Saluki'
        case 'Harrier':
            return 'Harrier'
        case 'Japanese Chin':
            return 'Japanese Chin'
        case 'Keeshond':
            return 'Keeshond'
        case 'Kuvasz':
            return 'Kuvasz'
        case 'Vizsla':
            return 'Vizsla'
        case 'Wirehaired Pointing Griffon':
            return 'Wirehaired Pointing Griffon'
        case 'Xoloitzcuintli':
            return 'Xoloitzcuintli'
        case 'Africanis':
            return 'Africanis'
        case 'Miniature Shiba Inu':
            return 'Miniature Shiba Inu'
        case 'Pharaoh Hound':
            return 'Pharaoh Hound'
        case 'Rhodesian Ridgeback':
            return 'Rhodesian Ridgeback'
    return ""
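
Before dropping unmatched names, it can help to see which raw strings fall through. The sketch below is not the mapping above: it assumes a small hypothetical alias table (`ALIASES`), a hypothetical helper (`normalize`), and made-up sample values, just to show the check.

```python
import pandas as pd

# Hypothetical, abbreviated alias table; the real mapping above is much longer.
ALIASES = {
    'GERM SHEPHERD': 'German Shepherd',
    'CHICHAUHUA': 'Chihuahua',
    'WEINER DOG': 'Dachshund',
    'DACHSHUND': 'Dachshund',
}


def normalize(raw):
    """Return the canonical breed name, or '' when there is no confident match."""
    if not isinstance(raw, str):  # NaN and other non-strings never match
        return ''
    return ALIASES.get(raw.strip(), '')


# Made-up sample of raw breed strings, including a missing value and an unknown name.
raw_breeds = pd.Series(['GERM SHEPHERD', 'WEINER DOG', 'LABRADOODLE', None, 'DACHSHUND'])
normalized = raw_breeds.map(normalize)

# Raw strings that would be silently excluded from the analysis.
unmatched = sorted(raw_breeds[(normalized == '') & raw_breeds.notna()].unique())
print(unmatched)
```

In this sample only the made-up name `LABRADOODLE` is reported; running the same check on the real column would show how much of the bite data the normalizer leaves out.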

With names normalized, we count total reported bites per breed.

Code
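animal_bites['Times'] = 1
# Note: groupby drops rows with NaN in any grouping key by default, so reports
# with a missing breed, gender, or bite location are not counted here
grouped_bites = animal_bites.groupby(['BreedIDDesc', 'GenderIDDesc', 'WhereBittenIDDesc']).agg({'Times': 'sum'})
grouped_bites.reset_index(inplace=True)
# Map each raw breed string to its canonical name ('' if unmatched)
grouped_bites['NormalizedBreed'] = grouped_bites.apply(normalized_breed_bites, axis=1)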

# breeds the normalizer could not match come back as '' — exclude them
grouped_bites = grouped_bites[grouped_bites['NormalizedBreed'] != '']
grouped_bites = grouped_bites.groupby(['NormalizedBreed']).agg({'Times': 'sum'})
grouped_bites.reset_index(inplace=True)
grouped_bites
NormalizedBreed Times
0 Airedale Terrier 2
1 Akita 19
2 Alaskan Malamute 10
3 American Bulldog 42
4 American Eskimo Dog 4
... ... ...
61 Spaniel 32
62 Spitz 2
63 Terrier 81
64 Weimaraner 11
65 Welsh Corgi 15

66 rows × 2 columns
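
One thing worth checking in the counting step: by default pandas `groupby` drops rows where any grouping key is missing, so grouping on breed, gender, and bite location leaves out reports that lack a gender or location. A small sketch on made-up rows (not the Louisville data) shows the difference:

```python
import numpy as np
import pandas as pd

# Made-up bite reports; two BOXER rows have a missing gender or bite location.
reports = pd.DataFrame({
    'BreedIDDesc': ['BOXER', 'BOXER', 'BOXER', 'BEAGLE'],
    'GenderIDDesc': ['MALE', np.nan, 'FEMALE', 'MALE'],
    'WhereBittenIDDesc': ['BODY', 'BODY', np.nan, 'HEAD'],
})
reports['Times'] = 1

# Grouping on all three keys silently drops rows with a NaN in any key.
three_keys = (reports
              .groupby(['BreedIDDesc', 'GenderIDDesc', 'WhereBittenIDDesc'])['Times'].sum()
              .groupby(level='BreedIDDesc').sum())

# Grouping on breed alone keeps every report that has a breed.
breed_only = reports.groupby('BreedIDDesc')['Times'].sum()

print(three_keys.to_dict())
print(breed_only.to_dict())
```

If the per-breed totals are meant to include reports with a missing gender or location, grouping on the breed alone (or passing `dropna=False`) would be one way to keep them.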

3. From Breeds to Traits

Each breed carries a list of character traits. We explode those lists so each trait maps to the set of breeds that share it, then attach each breed’s bite count. A trait’s score (Avg) is the mean bite count across the breeds that carry the trait and appear in the bite data; breeds with no reported bites are left out rather than counted as zero.

Code
dog_breeds['NormalizedBreed'] = dog_breeds.apply(normalized_breed_bites, axis=1)


def process_traits(row):
    """Split the comma-separated trait string into a list of lowercase traits."""
    return [trait.strip().lower() for trait in row['Character Traits'].split(',')]


dog_breeds['Character Traits'] = dog_breeds.apply(process_traits, axis=1)
unique_traits = dog_breeds['Character Traits'].explode().unique()

dog_breeds
  | Breed | Country of Origin | Fur Color | Height (in) | Color of Eyes | Longevity (yrs) | Character Traits | Common Health Problems | NormalizedBreed
0 | Labrador Retriever | Canada | Yellow, Black, Chocolate | 21-24 | Brown | 10-12 | [loyal, friendly, intelligent, energetic, good... | Hip dysplasia, obesity, ear infections | Labrador Retriever
1 | German Shepherd | Germany | Black, Tan | 22-26 | Brown | 7-10 | [loyal, intelligent, protective, confident, tr... | Hip dysplasia, elbow dysplasia, pancreatitis | German Shepherd
2 | Bulldog | England | White, Red | 12-16 | Brown | 8-10 | [loyal, calm, gentle, brave] | Skin allergies, respiratory issues, obesity | Bulldog
3 | Poodle | France | White, Black, Brown, Apricot | 10-15 | Brown, Blue | 12-15 | [intelligent, active, affectionate, hypoallerg... | Hip dysplasia, epilepsy, bladder stones | Poodle
4 | Beagle | England | White, Tan, Red, Lemon | 13-15 | Brown | 12-15 | [curious, friendly, energetic, good-natured] | Ear infections, hip dysplasia, epilepsy | Beagle
...
112 | Catahoula Leopard Dog | United States | Merle, Black | 20-26 | Brown | 12-14 | [intelligent, energetic, good-natured, loyal] | Dental problems, eye issues, skin allergies | Catahoula Leopard Dog
113 | Cocker Spaniel | England | Black, Brown | 14-15 | Brown | 12-15 | [intelligent, energetic, playful, good-natured] | Dental problems, eye issues, skin allergies | Spaniel
114 | Miniature Shiba Inu | Japan | Red, Sesame | 13-16 | Brown | 12-15 | [intelligent, energetic, playful, good-natured] | Dental problems, eye issues, skin allergies | Miniature Shiba Inu
115 | Belgian Tervuren | Belgium | Fawn | 22-26 | Brown | 12-14 | [intelligent, energetic, good-natured, loyal] | Dental problems, eye issues, skin allergies | Sheepdog
116 | Pharaoh Hound | Malta | Red | 21-25 | Brown | 12-14 | [intelligent, energetic, good-natured, loyal] | Dental problems, eye issues, skin allergies | Pharaoh Hound

117 rows × 9 columns

Code
conclusion_df = dog_breeds.explode('Character Traits').groupby(['Character Traits']).agg(
    {'NormalizedBreed': lambda x: list(set(x))})


def add_bites_per_breed(row):
    bites = []
    breeds = row['NormalizedBreed']
    for breed in breeds:
        bites.append(grouped_bites[grouped_bites['NormalizedBreed'] == breed]['Times'])
    return bites


conclusion_df['Bites Per Breed'] = conclusion_df.apply(lambda x: add_bites_per_breed(x), axis=1)

conclusion_df = conclusion_df.reset_index()

# For each trait, keep only the breeds that have a bite record
# (breeds missing from grouped_bites produce an empty lookup and are skipped)

all_bites = []
all_breeds = []
for i in range(len(conclusion_df)):
    bites = conclusion_df.iloc[i]['Bites Per Breed']
    new_bites = []
    new_breeds = []
    for b in range(len(bites)):
        if len(bites[b]) > 0:
            new_bites.append(bites[b].iloc[0])
            new_breeds.append(conclusion_df.iloc[i]['NormalizedBreed'][b])

    assert (len(new_bites) == len(new_breeds))
    all_bites.append(new_bites)
    all_breeds.append(new_breeds)

assert (len(all_bites) == len(all_breeds))
conclusion_df['Bites'] = all_bites
conclusion_df['Breeds With Bites'] = all_breeds
# .copy() avoids pandas' SettingWithCopyWarning when columns are added below
conclusion_df = conclusion_df[['Character Traits', 'Bites', 'Breeds With Bites']].copy()

conclusion_df['Max'] = conclusion_df['Bites'].apply(lambda x: max(x))
conclusion_df['Min'] = conclusion_df['Bites'].apply(lambda x: min(x))
conclusion_df['Avg'] = conclusion_df['Bites'].apply(lambda x: sum(x) / len(x))


def get_dangerous_breed(row):
    for i in range(len(row['Bites'])):
        if row['Max'] == row['Bites'][i]:
            return row['Breeds With Bites'][i]


conclusion_df['Dangerous Breed'] = conclusion_df.apply(lambda x: get_dangerous_breed(x), axis=1)
conclusion_df
Character Traits Bites Breeds With Bites Max Min Avg Dangerous Breed
0 active [44] [Poodle] 44 44 44.000000 Poodle
1 affectionate [44, 52, 6, 9, 5, 99, 17, 32, 17, 21] [Poodle, Bulldog, Setter, Bichon Frise, Peking... 99 5 30.200000 Shih Tzu
2 athletic [7] [Greyhound] 7 7 7.000000 Greyhound
3 brave [52] [Bulldog] 52 52 52.000000 Bulldog
4 calm [52] [Bulldog] 52 52 52.000000 Bulldog
5 charming [17] [Pug] 17 17 17.000000 Pug
6 confident [135, 246, 64, 99] [Chihuahua, German Shepherd, Rottweiler, Shih ... 246 64 136.000000 German Shepherd
7 curious [74, 92] [Dachshund, Beagle] 92 74 83.000000 Beagle
8 energetic [11, 19, 135, 19, 23, 17, 35, 74, 3, 2, 4, 81,... [Collie, Sheepdog, Chihuahua, Australian Cattl... 225 1 39.216216 Labrador Retriever
9 friendly [225, 92, 51] [Labrador Retriever, Beagle, Golden Retriever] 225 51 122.666667 Labrador Retriever
10 gentle [28, 9, 19, 1, 1, 52, 26, 7, 9, 17] [Great Dane, Saint Bernard, Sheepdog, Irish Wo... 52 1 16.900000 Bulldog
11 good-natured [19, 11, 19, 99, 17, 28, 35, 1, 3, 2, 4, 81, 7... [Sheepdog, Collie, Australian Cattle Dog, Shih... 225 1 30.974359 Labrador Retriever
12 hypoallergenic [44] [Poodle] 44 44 44.000000 Poodle
13 independent [48, 10, 7, 5] [Siberian Husky, Shar-Pei, Greyhound, Pekingese] 48 5 17.500000 Siberian Husky
14 intelligent [44, 11, 19, 19, 23, 17, 28, 35, 1, 3, 2, 4, 8... [Poodle, Collie, Sheepdog, Australian Cattle D... 246 1 34.727273 German Shepherd
15 kind [51] [Golden Retriever] 51 51 51.000000 Golden Retriever
16 loyal [11, 19, 135, 19, 23, 17, 28, 64, 35, 74, 1, 4... [Collie, Sheepdog, Chihuahua, Australian Cattl... 246 1 46.517241 German Shepherd
17 patient [17] [Basset Hound] 17 17 17.000000 Basset Hound
18 playful [11, 19, 19, 99, 17, 74, 3, 2, 4, 81, 72, 5, 3... [Collie, Sheepdog, Australian Cattle Dog, Shih... 149 2 34.925926 Boxer
19 protective [64, 9, 19, 246, 26, 9, 23, 149] [Rottweiler, Saint Bernard, Sheepdog, German S... 246 9 68.125000 German Shepherd
20 sensitive [135, 7] [Chihuahua, Greyhound] 135 7 71.000000 Chihuahua
21 social [9, 17] [Bichon Frise, Pug] 17 9 13.000000 Pug
22 strong [48, 64] [Siberian Husky, Rottweiler] 64 48 56.000000 Rottweiler
23 trainable [246, 11] [German Shepherd, Collie] 246 11 128.500000 German Shepherd
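
The trait table above can also be expressed as an explode, a merge, and a group-by. The sketch below is an alternative formulation, not the notebook's code; it uses a tiny made-up breed table (breed names, traits, and bite counts are illustrative only) and, like the code above, leaves out breeds with no bite record instead of counting them as zero.

```python
import pandas as pd

# Made-up inputs for illustration only.
breeds = pd.DataFrame({
    'NormalizedBreed': ['Breed A', 'Breed B', 'Breed C'],
    'Character Traits': [['loyal', 'friendly'], ['loyal'], ['friendly', 'calm']],
})
bites = pd.DataFrame({'NormalizedBreed': ['Breed A', 'Breed B'], 'Times': [10, 4]})

# One row per (breed, trait); the inner merge drops breeds with no bite record.
long_form = breeds.explode('Character Traits').merge(bites, on='NormalizedBreed', how='inner')

# Per-trait summary over the breeds that remain.
trait_stats = (long_form
               .groupby('Character Traits')['Times']
               .agg(Max='max', Min='min', Avg='mean')
               .reset_index())
print(trait_stats)
```

Because "Breed C" has no bite record, the trait "calm" vanishes from the result entirely, which is worth keeping in mind when reading single-breed rows in the table above.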

4. Ranking the Traits

We rank traits by that average and keep the three highest (“most bites”) and three lowest (“least bites”).

Code
trait_vs_bite = conclusion_df[['Character Traits', 'Avg']]

trait_vs_bite.plot.bar(x='Character Traits', y='Avg', figsize=(10, 5), rot=90)

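# .copy() so columns can be added to these slices later without a SettingWithCopyWarning
top_3_most_dangerous = trait_vs_bite.sort_values(by='Avg', ascending=False).head(3).copy()
top_3_most_safe = trait_vs_bite.sort_values(by='Avg', ascending=True).head(3).copy()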

For each of those traits, we break the bite count back down by breed. The stacked bars below show these per-breed counts, so each bar’s height is the sum across breeds, not the average used for the ranking.

Code
# Look up each trait's list of breeds, then give every breed its own row
breeds_by_trait = conclusion_df.set_index('Character Traits')['Breeds With Bites']

top_3_most_dangerous['Breeds'] = top_3_most_dangerous['Character Traits'].map(breeds_by_trait)
top_3_most_dangerous = top_3_most_dangerous.explode('Breeds')
top_3_most_safe['Breeds'] = top_3_most_safe['Character Traits'].map(breeds_by_trait)
top_3_most_safe = top_3_most_safe.explode('Breeds')


top_3_most_dangerous
Character Traits Avg Breeds
6 confident 136.000000 Chihuahua
6 confident 136.000000 German Shepherd
6 confident 136.000000 Rottweiler
6 confident 136.000000 Shih Tzu
23 trainable 128.500000 German Shepherd
23 trainable 128.500000 Collie
9 friendly 122.666667 Labrador Retriever
9 friendly 122.666667 Beagle
9 friendly 122.666667 Golden Retriever
Code
def insert_bite_from_breed(row):
    breed = row['Breeds']
    details = grouped_bites.loc[grouped_bites['NormalizedBreed'] == breed]
    return details['Times'].iloc[0]

top_3_most_dangerous['Bites'] = top_3_most_dangerous.apply(lambda x: insert_bite_from_breed(x), axis=1)
top_3_most_safe['Bites'] = top_3_most_safe.apply(lambda x: insert_bite_from_breed(x), axis=1)
top_3_most_dangerous
Character Traits Avg Breeds Bites
6 confident 136.000000 Chihuahua 135
6 confident 136.000000 German Shepherd 246
6 confident 136.000000 Rottweiler 64
6 confident 136.000000 Shih Tzu 99
23 trainable 128.500000 German Shepherd 246
23 trainable 128.500000 Collie 11
9 friendly 122.666667 Labrador Retriever 225
9 friendly 122.666667 Beagle 92
9 friendly 122.666667 Golden Retriever 51
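
Note that the ranking uses the per-breed average, while a stacked bar’s height is the sum across breeds. Using the bite counts from the table above, a quick check shows that the two orderings differ:

```python
# Bite counts per breed, copied from the table above.
bites_by_trait = {
    'confident': [135, 246, 64, 99],
    'trainable': [246, 11],
    'friendly': [225, 92, 51],
}

averages = {trait: sum(b) / len(b) for trait, b in bites_by_trait.items()}
totals = {trait: sum(b) for trait, b in bites_by_trait.items()}

rank_by_avg = sorted(averages, key=averages.get, reverse=True)
rank_by_total = sorted(totals, key=totals.get, reverse=True)

print(rank_by_avg)    # ['confident', 'trainable', 'friendly']
print(rank_by_total)  # ['confident', 'friendly', 'trainable']
```

“Trainable” ranks second by average largely because only two breeds carry it, one of which is the German Shepherd.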

5. The Charts

Code
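import os

import matplotlib.pyplot as plt
import seaborn as sns

# plot_stacked_bar saves into charts/, and savefig fails if the folder is missing
os.makedirs('charts', exist_ok=True)

plot_danger_data = top_3_most_dangerous.pivot(index='Character Traits', columns='Breeds', values='Bites')

sns.set_style("whitegrid")
sns.set_palette("tab20")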
def plot_stacked_bar(data, title, fname):
    ax = data.plot(kind='bar', stacked=True, figsize=(10, 10), width=0.75)

    ax.grid(False)                    # Remove grid
    ax.spines['top'].set_visible(False)     # Remove top axis line
    ax.spines['right'].set_visible(False)   # Remove right axis line
    ax.spines['left'].set_visible(False)    # Remove left axis line
    ax.spines['bottom'].set_visible(False)  # Remove bottom axis line
    ax.set_yticks([])
    plt.title(title, fontsize=16, fontweight='bold')
    plt.xlabel('Character Traits')
    plt.ylabel('Bites')
    plt.xticks(rotation=0)
    plt.legend(title='Breeds')  # the stacked segments are breeds

    # Add total values on top of bars
    totals = data.sum(axis=1)  # Calculate totals for each character trait
    for i, (trait, total) in enumerate(totals.items()):

        ax.text(i, total + 0.5, str(int(total)),
                ha='center',
                va='bottom',
                fontweight='bold',
                fontsize=14,
                color='darkblue',
                # bbox=dict(boxstyle="round,pad=0.3", facecolor="white", alpha=0.8)
                )

    # Add individual values inside each stack segment
    cumulative = data.cumsum(axis=1)  # Calculate cumulative sums for positioning


    for i, trait in enumerate(data.index):
        for j, col in enumerate(data.columns):
            value = data.loc[trait, col]
            if value > 0:
                # Position text in the middle of each segment using cumulative values
                cum_value = cumulative.loc[trait, col]
                y_pos = cum_value - (value / 2)  # Middle of the current segment

                ax.text(i, y_pos, str(int(value)),
                        ha='center', va='center', fontweight='bold',
                        color='white' if value > 3 else 'black',
                        fontsize=9)

    plt.tight_layout()

    plt.savefig(f'charts/{fname}', dpi=400, bbox_inches='tight')
    plt.show()

plot_stacked_bar(plot_danger_data, 'Top 3 Dog Traits with Most Bites', 'top-traits-most-bites.png')
plot_safe_data = top_3_most_safe.pivot(index='Character Traits', columns='Breeds', values='Bites')
plot_stacked_bar(plot_safe_data, 'Top 3 Dog Traits with Least Bites', 'top-traits-least-bites.png')

6. The Catch: Base Rates

Taken at face value, the charts say friendly, confident, trainable dogs are the biters, and the intimidating breeds are safe. That reading is wrong, and the reason matters more than the ranking:

  1. These are counts, not rates. Labradors and German Shepherds are among the most popular dogs in America. More dogs → more encounters with people → more reported bites, regardless of temperament. A trait carried by popular family breeds inherits their exposure.
  2. The denominator is missing. To say anything about a breed’s propensity to bite we would need bites per registered dog (licensing or population data), which this dataset does not contain.
  3. Selection effects run the other way, too. Breeds with an aggressive reputation are rarer, more regulated, and their owners are often more cautious — all of which suppresses their absolute bite counts.
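
A toy calculation makes the first two points concrete. The breed names and population figures below are invented purely for illustration (the dataset has no such numbers); they show how a breed with more reported bites can still have the lower rate per dog.

```python
# Invented numbers for illustration; not from the Louisville data.
breeds = {
    # breed: (reported bites, dogs living in the area)
    'Popular Breed': (200, 20_000),
    'Rare Breed': (30, 1_000),
}

# Bites per 1,000 dogs: the rate that raw counts cannot show.
rates = {name: 1000 * bite_count / population
         for name, (bite_count, population) in breeds.items()}

for name, (bite_count, _) in breeds.items():
    print(f'{name}: {bite_count} bites, {rates[name]:.1f} per 1,000 dogs')
```

With these made-up figures the popular breed accounts for most of the bites but has one third of the rare breed’s rate, which is exactly the kind of reversal raw counts hide.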

What the data does support

  • Most reported bites come from popular, “friendly” family breeds — so a friendly reputation is no reason to skip supervision, especially around children.
  • Raw bite counts cannot separate exposure from temperament. Any “dangerous breeds” claim built on raw counts, in either direction, should be treated with suspicion.

Next steps

Bringing in breed licensing counts for the same county would turn these counts into rates and make a temperament comparison honest. Until then, the headline “aggressive dogs don’t bite” is exactly as true as the data is naive.
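
As a rough sketch of that step: the bite counts below are taken from the tables above, but the licensing table is entirely hypothetical. Its name (`licenses`), its `LicensedDogs` column, and its counts are assumptions made up to show the join, not real figures.

```python
import pandas as pd

# Bite counts for three breeds, copied from the tables above.
grouped_bites = pd.DataFrame({
    'NormalizedBreed': ['German Shepherd', 'Labrador Retriever', 'Rottweiler'],
    'Times': [246, 225, 64],
})

# Hypothetical licensing table: layout, column name, and counts are made up.
licenses = pd.DataFrame({
    'NormalizedBreed': ['German Shepherd', 'Labrador Retriever'],
    'LicensedDogs': [4_000, 9_000],
})

# Left merge keeps every breed with bites; breeds lacking a denominator get NaN.
rates = grouped_bites.merge(licenses, on='NormalizedBreed', how='left', validate='one_to_one')
rates['BitesPer1000'] = 1000 * rates['Times'] / rates['LicensedDogs']

print(rates.sort_values('BitesPer1000', ascending=False))
```

Breeds without a licensing count end up with a missing rate instead of being dropped, which keeps the gaps in the denominator visible. Licensing data would also need the same breed-name normalization as the bite reports before the merge.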