Blog

Generative AI and STEM

Background

Artificial intelligence is not new. It has been part of our personal and work lives for a long time (autocorrect, facial recognition, satnav, etc.), and large language models like ChatGPT have been a big topic in education since version 3.5 was released in late November 2022. Large language models (LLMs) are trained on enormous amounts of data to recognize the patterns of and connections between words, and then produce text based on the probability of which word is most likely to come next. One thing LLMs don't do, however, is computation. That said, the most recent OpenAI release, GPT-4, seems to have made strides on standardized tests in many STEM areas, and GPT-4 now has a plug-in for Wolfram Alpha, which does do computation.
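To make the "next most likely word" idea concrete, here is a toy sketch (a deliberately tiny, made-up probability table, nothing like a real LLM) of generating text by repeatedly sampling the next word:

```python
# Toy illustration only: the probability table is invented, not learned.
import random

# hypothetical next-word probabilities of the kind an LLM learns from training data
next_word = {
    "the": {"cat": 0.5, "dog": 0.3, "derivative": 0.2},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.6, "barked": 0.4},
    "derivative": {"is": 1.0},
}

def generate(start: str, length: int = 5) -> str:
    words = [start]
    for _ in range(length):
        options = next_word.get(words[-1])
        if not options:          # no continuation known for this word
            break
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```

Nothing in this loop "knows" mathematics; it only chooses plausible next words, which is why computation is a weak point.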

Chart from OpenAI: exam result improvements from GPT-3.5 to GPT-4

Andrew Roberts (Math Dept) and Susan Bonham (EdTech) did some testing to see how ChatGPT (3.5), GPT-4, and GPT-4 with the Wolfram plugin would handle some questions from Langara’s math courses.

Test Details

Full test results are available; the link includes an accessible version of the problems and full details of the "chats" and subsequent discussion for each AI response.

The following questions were tested:

 

Problem 1: (supplied by Langara mathematics instructor Vijay Singh)

 

Problem 2: (Precalculus)

 

Problem 3: (Calculus I)

 

Problem 4: (Calculus II)

 

Discussion

Responses from current versions of ChatGPT are not reliable enough to be accepted uncritically.

ChatGPT needs to be approached as a tool, and responses need careful proofreading to check for errors in computation or reasoning. Errors may be blatant and readily apparent, or subtle and hard to spot without close reading and a solid understanding of the concepts.

Perhaps the biggest danger for a student learning a subject is in the “plausibility” of many responses even when they are incorrect. ChatGPT will present its responses with full confidence in their correctness, whether this is justified or not.

When a response contains errors or lacks clarity, further prompting is needed to correct and refine it. This requires a certain amount of base knowledge on the part of the user in order to guide ChatGPT to the correct solution.

Algebraic computations cannot be trusted as ChatGPT does not “know” the rules of algebra but is simply appending steps based on a probabilistic machine-learning model that references the material on which it was trained. The quality of the answers will depend on the quality of the content on which ChatGPT was trained. There is no way for us to know exactly what training material ChatGPT is referencing when generating its responses. The average quality of solutions sourced online should give us pause.

Below is one especially concerning example of an error encountered during our testing sessions:

In its response to the optimization problem (Problem 3), GPT-3.5 attempts to differentiate the volume function. In the derivative it produces, it has incorrectly differentiated the first term with respect to R while correctly differentiating the second term with respect to h.

It is the plausibility of the above solution (despite the bad error) that is dangerous for a student who may take the ChatGPT response at face value.
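One practical safeguard is to check ChatGPT's algebra independently with a computer algebra system. Below is a minimal SymPy sketch of that idea; the volume function and the "claimed" derivative are hypothetical stand-ins (the actual expressions from Problem 3 appear in the full test results), used only to show how such a check exposes a mixed-variable differentiation error.

```python
# Minimal sketch: verify a reported derivative before trusting it.
# Assumptions: SymPy is installed; V below is a stand-in, not the actual Problem 3 function.
import sympy as sp

R, h = sp.symbols("R h", positive=True)
V = sp.pi * R**2 * h - sp.Rational(1, 3) * sp.pi * h**3   # stand-in volume function, R treated as a constant

dV_dh = sp.diff(V, h)       # differentiate with respect to h only
print(dV_dh)                # pi*R**2 - pi*h**2 (up to term ordering)

# Suppose a chatbot claimed the derivative was 2*pi*R*h - pi*h**2, i.e. it mixed
# d/dR into the first term (the same kind of slip described above).
claimed = 2 * sp.pi * R * h - sp.pi * h**2
print(sp.simplify(claimed - dV_dh) == 0)   # False -> the claimed derivative is wrong
```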

Access to the Wolfram plugin in GPT-4 should mean that algebraic computations performed within requests sent to Wolfram can be trusted. However, errors in reasoning and interpretation can still occur in the steps between those requests.

Concluding Thought

It will be important for us to educate our students about the dangers involved in using this tool uncritically, while acknowledging the potential benefits if it is used correctly.

Want to Learn More?

EdTech and TCDC run workshops on various AI topics. You can request a bespoke AI workshop tailored to your department or check out the EdTech and TCDC workshop offerings. For all other questions, please contact edtech@langara.ca.

Brightspace Accessibility in Five, Bonus: Accessible Uploads

Brightspace plus accessibility logo

Brightspace is an excellent tool to provide equitable, inclusive access to course content, documents, and media.

As you create content, take advantage of Brightspace’s built-in tools and the Accessibility Checker to ensure what you share is accessible. Accessible content is inclusive, democratic, and maximizes learner independence. 

However, Brightspace is also a good tool for distributing other material, such as lecture slides and documents. It is important that this material also be accessible.

Creating accessible Word and PowerPoint documents is straightforward. Ensuring a PDF is accessible requires additional time and understanding of unique tools and code. 

The best practices (link text, colour contrast, headings, tables, and text equivalents) listed in this series apply to documents of all types. The process to ensure accessibility differs slightly depending on the software.

Microsoft Office Files

Word and PowerPoint have a built-in accessibility checker. To use this tool: 

  1. Navigate to Review 
  2. Select Check Accessibility 

Read more about making Office documents accessible.

PDF

To make accessible PDFs, it is best practice to make a Word document or PowerPoint presentation accessible and then export it to PDF. Adobe Acrobat Pro is required to ensure your PDFs are accessible. Try to avoid PDFs for content, except for forms and material specifically intended to be printed. For more information on making PDFs accessible, consult Langara's Accessibility Handbook for Teaching and Learning.

docReader

Brightspace now features the docReader tool. When a Word, PowerPoint, or PDF file is uploaded to a Brightspace course, students can have it read aloud using the Open with docReader button below the document viewer pane.

This tool does not absolve content creators of the responsibility to create accessible content: docReader will not be able to read inaccessible documents properly.


Check out the other posts in the Brightspace Accessibility in Five series:

  1. Link Text
  2. Colour
  3. Headings
  4. Tables
  5. Text Equivalents

Brightspace Accessibility in Five, 5: Text Equivalents

Brightspace plus accessibility logo

Brightspace is an excellent tool to provide equitable, inclusive access to course content, documents, and media.

As you create content, take advantage of Brightspace’s built-in tools and the Accessibility Checker to ensure what you share is accessible. Accessible content is inclusive, democratic, and maximizes learner independence.

In the fifth of this five-part series, we will learn about text equivalents (alternative text and closed captions).

Alternative Text

Alternative text explains the content and context of an image to screen reader users. To write effective alternative text, consider how you would describe the graphic to a friend over the phone. Try to include all relevant information using proper grammar in less than 120 characters. Learn more in the Langara Accessibility Handbook alternative text chapter. 

Images may be marked as decorative if they are only included for visual effect or if the information in the image is also present in text adjacent to the graphic. 

When uploading a new image, Brightspace automatically prompts for alternative text. Enter a description in the Alternative Text field or check This image is decorative.

Screenshot of the Brightspace alternative text prompt

To add alternative text to existing images: 

  1. Select an existing image and choose Image options
  2. Check Image is decorative or enter a description in the Alternative description field.

Closed Captions

Captions provide a text equivalent of all audio elements in a video, presented visually in time with the video. Closed captions can be toggled on or off by the viewer. Open captions are ‘burned’ into the video and cannot be turned off.

Traditionally, we think of captions as an accommodation for viewers who cannot hear the audio in a video due to hearing loss. Statistics suggest 4-5% of the general population have some form of hearing loss. That number increases to around 20% for people over the age of 60. However, 80% of 18- to 25-year-olds regularly use captions when watching video.

Captions are not just an accessibility essential, but also an excellent universal design for learning tool.

All new uploads to Kaltura (either in Brightspace My Media or Mediaspace) will have machine-generated captions automatically ordered. Videos added to OneDrive/SharePoint can have machine-generated captions ordered manually. Machine-generated captions are not accurate enough and must be edited. To learn more about captioning, read the Captions and Transcripts chapter of the Langara Accessibility Handbook.

Accessibility Checker

Brightspace includes a built-in accessibility checker. The checker appears on the second row of the editor toolbar.

  1. Select More Actions to reveal the second row of the toolbar
  2. Select Accessibility Checker

The accessibility checker will highlight many accessibility issues and offer solutions to correct them. However, the checker tool will not check videos for captions. This must be verified manually.


Check out the other posts in the Brightspace Accessibility in Five series:

  1. Link Text
  2. Colour
  3. Headings
  4. Tables
  5. Text Equivalents
  6. Bonus: Accessible Uploads

A.I. Detection: A Better Approach 

Over the past few months, EdTech has shared concerns about A.I. classifiers, such as Turnitin's A.I. detection tool, AI Text Classifier, GPTZero, and ZeroGPT. Both in-house testing and statements from Turnitin and OpenAI confirm that A.I. text classifiers cannot reliably differentiate between A.I.- and human-generated writing. Given that the tools are unreliable and easy to manipulate, EdTech discourages their use. Instead, we suggest using Turnitin's Similarity Report to help identify A.I.-hallucinated and fabricated references.

What is Turnitin's Similarity Report?

The Turnitin Similarity Report quantifies how similar a submitted work is to other pieces of writing, including works on the Internet and those stored in Turnitin’s extensive database, highlighting sections that match existing sources. The similarity score represents the percentage of writing that is similar to other works. 

AI Generated References 

A.I. researchers call the tendency of A.I. to make stuff up a "hallucination." A.I.-generated responses can appear convincing but may include irrelevant, nonsensical, or factually incorrect answers.

ChatGPT and other natural language processing programs do a poor job of referencing sources and often fabricate plausible-looking references. Because the references seem real, students often mistake them for legitimate sources.

Common reference or citation errors include: 

  • Failure to include a Digital Object Identifier (DOI) or incorrect DOI 
  • Misidentification of source information, such as journal or book title 
  • Incorrect publication dates 
  • Incorrect author information 

Using Turnitin to Identify Hallucinated References 

To use Turnitin to identify hallucinated or fabricated references, do not exclude quotes and bibliographic material from the Similarity Report. Quotes and bibliographic information will be flagged as matching or highly similar to source-based evidence. Fabricated quotes, references, and bibliographic information will have zero similarity because they will not match source-based evidence.

Quotes and bibliographic information with no similarity to existing works should be investigated to confirm whether they are fabricated.

References

Athaluri, S., Manthena, S., Kesapragada, V., et al. (2023). Exploring the boundaries of reality: Investigating the phenomenon of artificial intelligence hallucination in scientific writing through ChatGPT references. Cureus, 15(4), e37432. https://doi.org/10.7759/cureus.37432

Metz, C. (2023, March 29). What makes A.I. chatbots go wrong? The curious case of the hallucinating software. New York Times. https://www.nytimes.com/2023/03/29/technology/ai-chatbots-hallucinations.html

OpenAI. (2022, January 27). Aligning language models to follow instructions. https://openai.com/research/instruction-following

Weise, K., & Metz, C. (2023, May 1). When A.I. chatbots hallucinate. New York Times. https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html

Welborn, A. (2023, March 9). ChatGPT and fake citations. Duke University Libraries. https://blogs.library.duke.edu/blog/2023/03/09/chatgpt-and-fake-citations/

screenshot of a Turnitin Similarity Report, with submitted text on the left and the report panel on the right

AccessAbility Week at Langara

AccessAbility Week, starting on Sunday, May 28, is an excellent opportunity to:

  • Acknowledge and celebrate individuals with disabilities.
  • Advance and emphasize the ongoing efforts to reduce barriers.
  • Reflect upon and acknowledge the progress made in fostering inclusivity and accessibility.

The goal of accessibility is to ensure that everyone can participate fully in their communities. Accessibility is not an accommodation. Accessibility is not about making space for any one person; it’s about building environments that are more inclusive and easier to access for anyone.

While accessibility is crucial for people with disabilities, efforts to reduce barriers benefit everyone. Barriers hinder people from being included, accessing information, and participating fully. Some examples of barriers are:

  • Doors without automatic openers.
  • Poorly lit rooms.
  • Inaccessible digital documents.
  • Websites that are hard to use.
  • Attitudes towards disabilities and accessibility.

Reducing barriers increases diversity, inclusion, and independence for everyone.

In EdTech, we are committed to developing resources that enhance accessibility across Langara’s digital environments. This includes promoting Brightspace best practices, improving captions in Kaltura MediaSpace, and creating resources to develop accessible content in Langara’s core technologies.

Explore our digital accessibility resources and email assistivetech@langara.ca to learn more.

In addition to AccessAbility Week, Global Accessibility Awareness Day (GAAD) occurs on May 18. Numerous GAAD events are scheduled, offering excellent opportunities to learn more about accessibility.

As an introduction to digital accessibility, consider this brief presentation:

Please contact assistivetech@langara.ca for more information.

Brightspace Accessibility in Five, 4: Tables

Brightspace plus accessibility logo

Brightspace is an excellent tool to provide equitable, inclusive access to course content, documents, and media.

As you create content, take advantage of Brightspace’s built-in tools and the Accessibility Checker to ensure what you share is accessible. Accessible content is inclusive, democratic, and maximizes learner independence.

In the fourth of this five-part series, we will learn about tables.

Tables

Tables should only be used to present data, not for layout or formatting. Include a header row and/or column and avoid blank, split, and merged cells.

Use the Table tool to insert and modify tables:

Use Table Properties for advanced settings such as style, padding, and formatting.

To set header rows:

  1. Select a cell in the row to be made a header
  2. Open the Table menu and choose Cell Properties
  3. Change Row type to Header and click Save

Do not add images of tables. If you must, ensure the image has alternative text that accurately conveys the table data.

Accessibility Checker

Brightspace includes a built-in accessibility checker. The checker appears on the second row of the editor toolbar.

  1. Select More Actions to reveal the second row of the toolbar
  2. Select Accessibility Checker

The accessibility checker will highlight many accessibility issues and offer solutions to correct them. For tables, the accessibility checker will flag tables without header rows or columns. The checker will also note tables without a caption and suggest users add a summary to long or complex tables.


Watch for more posts in the Brightspace Accessibility in Five series coming soon, including:

  1. Link Text
  2. Colour
  3. Headings
  4. Tables
  5. Text Equivalents
  6. Bonus: Accessible Uploads

AI Classifiers — What’s the problem with detection tools?

AI classifiers don’t work!

Natural language processing AIs are meant to be convincing. They create content that "sounds plausible because it's all derived from things that humans have said" (Marcus, 2023). The intent is to produce outputs that mimic human writing. The result: the world's leading AI companies can't reliably distinguish the products of their own machines from the work of humans.

In January, OpenAI released its own AI text classifier. According to OpenAI, "Our classifier is not fully reliable. In our evaluations on a 'challenge set' of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as 'likely AI-written,' while incorrectly labeling human-written text as AI-written 9% of the time (false positives)."

A bit about how AI classifiers identify AI-generated content

GPTZero, a commonly used detection tool, identifies AI created works based on two factors: perplexity and burstiness.

Perplexity measures the complexity of text. Classifiers identify text that is predictable and lacking complexity as AI-generated and highly complex text as human-generated.

Burstiness compares variation between sentences. It measures how predictable a piece of content is by the homogeneity of the length and structure of sentences throughout the text. Human writing tends to be variable, switching between long and complex sentences and short, simpler ones. AI sentences tend to be more uniform with less creative variability.

The lower the perplexity and burstiness score, the more likely it is that text is AI generated.
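As a rough illustration of these two ideas, here is a toy sketch. It is not how GPTZero actually computes its scores (a real classifier uses a trained language model); it simply approximates "perplexity" with a unigram model built from the text itself and "burstiness" as the spread of sentence lengths.

```python
# Toy approximations only: real detectors use trained language models.
import math
import re
from collections import Counter

def pseudo_perplexity(text: str) -> float:
    """Higher = less predictable under a unigram model built from the text itself."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    # average negative log-probability of each word, exponentiated
    nll = -sum(math.log(counts[w] / total) for w in words) / total
    return math.exp(nll)

def burstiness(text: str) -> float:
    """Standard deviation of sentence length (in words): higher = more variation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return variance ** 0.5

sample = ("The cat sat on the mat. The dog sat on the mat. "
          "It was a very long and unusually complicated afternoon for everyone involved.")
print(pseudo_perplexity(sample), burstiness(sample))
```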

Turnitin is a plagiarism-prevention tool that helps check the originality of student writing. On April 4th, Turnitin released an AI-detection feature.

According to Turnitin, its detection tool works a bit differently.

When a paper is submitted to Turnitin, the submission is first broken into segments of text that are roughly a few hundred words (about five to ten sentences). Those segments are then overlapped with each other to capture each sentence in context.

The segments are run against our AI detection model, and we give each sentence a score between 0 and 1 to determine whether it is written by a human or by AI. If our model determines that a sentence was not generated by AI, it will receive a score of 0. If it determines the entirety of the sentence was generated by AI it will receive a score of 1.

Using the average scores of all the segments within the document, the model then generates an overall prediction of how much text (with 98% confidence based on data that was collected and verified in our AI innovation lab) in the submission we believe has been generated by AI. For example, when we say that 40% of the overall text has been AI-generated, we’re 98% confident that is the case.

Currently, Turnitin’s AI writing detection model is trained to detect content from the GPT-3 and GPT-3.5 language models, which includes ChatGPT. Because the writing characteristics of GPT-4 are consistent with earlier model versions, our detector is able to detect content from GPT-4 (ChatGPT Plus) most of the time. We are actively working on expanding our model to enable us to better detect content from other AI language models.
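The segmentation-and-averaging scheme Turnitin describes can be sketched roughly as follows. This is a toy illustration only; `score_sentence` is a stand-in for Turnitin's actual detection model, which is not public.

```python
# Toy sketch of the described scoring scheme (not Turnitin's code).
import re
import random

def score_sentence(sentence: str) -> float:
    """Placeholder for a trained classifier: 0 = human-written, 1 = AI-generated."""
    return random.random()

def overall_ai_percentage(text: str, window: int = 8, step: int = 4) -> float:
    # split into sentences, then score overlapping windows of `window` sentences
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scores = []
    for start in range(0, max(len(sentences) - window + 1, 1), step):
        for sentence in sentences[start:start + window]:
            scores.append(score_sentence(sentence))
    # average of all per-sentence scores, reported as a percentage
    return 100 * sum(scores) / len(scores) if scores else 0.0
```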

The Issues

AI detectors cannot prove conclusively if text is AI generated. With minimal editing, AI-generated content evades detection.

L2 writers tend to write with less "burstiness." Concern about bias is one of the reasons UBC chose not to enable Turnitin's AI-detection feature.

ChatGPT’s writing style may be less easy to spot than some think.

Privacy violations are a concern with both generators and detectors as both collect data.

Now what?

Langara’s EdTech, TCDC, and SCAI departments are working together to offer workshops on four potential approaches: Embrace it, Neutralize it, Ban it, Ignore it. Interested in a bespoke workshop for your department? Complete the request form.


References
Marcus, G. (2023, January 6). Ezra Klein interviews Gary Marcus [Audio podcast episode]. In The Ezra Klein Show. https://www.nytimes.com/2023/01/06/podcasts/transcript-ezra-klein-interviews-gary-marcus.html

Fowler, G. A. (2023, April 3). We tested a new ChatGPT-detector for teachers. It flagged an innocent student. Washington Post. https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-Turnitin/

AI Detection Tool Testing — Initial Results

We’ve limited our testing to Turnitin’s AI detection tool. Why? Turnitin has undergone privacy and risk reviews and is a college-approved technology. Other detection tools haven’t been reviewed and may not meet recommended data privacy standards.

What We’ve Learned So Far

  • Unedited AI-generated content often receives a 100% AI-generated score, although more complex writing by GPT-4 can score far less than 100%.
  • Adding typos and grammar mistakes, or prompting the AI generator to include errors throughout a document, can change the AI-generated score from 100% to 0%.
  • Adding I-statements throughout a document has a dramatic impact in lowering the AI score. 
  • Interrupting the flow of text by replacing one word every couple of sentences with a less likely word increases the perplexity of the wording and lowers the AI-generated percentage. AI text generators act like text predictors, creating text by adding the most likely next word. If the detector is perplexed by a word because the word is not the most likely choice, the text is judged more likely to be human-written.
  • Unlike human-generated writing, AI sentences tend to be uniform. Changing the length of sentences throughout a document, making some sentences shorter and others longer and more complex, alters the burstiness and lowers the generated-by-AI score. 
  • By replacing one or two words per paragraph and modifying the length of sentences here and there throughout a chunk of text — i.e. by doing minor tweaks of both perplexity and burstiness — the AI-generated score changes from 100% to 0%. 

To learn more about how AI detection tools work, read AI Classifiers — What’s the problem with detection tools?

AI tools & privacy

ChatGPT is underpinned by a large language model that requires massive amounts of data to function and improve. The more data the model is trained on, the better it gets at detecting patterns, anticipating what will come next and generating plausible text.

Uri Gal notes the following privacy concerns in The Conversation:

  • None of us were asked whether OpenAI could use our data. This is a clear violation of privacy, especially when data are sensitive and can be used to identify us, our family members, or our location.
  • Even when data are publicly available their use can breach what we call contextual integrity. This is a fundamental principle in legal discussions of privacy. It requires that individuals’ information is not revealed outside of the context in which it was originally produced.
  • OpenAI offers no procedures for individuals to check whether the company stores their personal information, or to request it be deleted. This is a guaranteed right in accordance with the European General Data Protection Regulation (GDPR) – although it’s still under debate whether ChatGPT is compliant with GDPR requirements.
  • This “right to be forgotten” is particularly important in cases where the information is inaccurate or misleading, which seems to be a regular occurrence with ChatGPT.
  • Moreover, the scraped data ChatGPT was trained on can be proprietary or copyrighted.

When we use AI tools, including detection tools, we are feeding data into these systems. It is important that we understand our obligations and risks.

When an assignment is submitted to Turnitin, the student’s work is saved as part of Turnitin’s database of more than 1 billion student papers. This raises privacy concerns that include:

  • Students’ inability to remove their work from the database
  • The indefinite length of time that papers are stored
  • Access to the content of the papers, especially personal data or sensitive content, including potential security breaches of the server

AI detection tools, including Turnitin, should not be used without students’ knowledge and consent. While Turnitin is a college-approved tool, using it without students’ consent poses a copyright risk (Strawczynski, 2004).  Other AI detection tools have not undergone privacy and risk assessments by our institution and present potential data privacy and copyright risks.

For more information, see our Guidelines for Using Turnitin.

Getting Started with ChatGPT

Tips for writing effective prompts

Prompt-crafting takes practice:

  • Focus on tasks where you are an expert & get GPT to help.
  • Give the AI context.
  • Give it step-by-step directions.
  • Get an initial answer. Ask for changes and edits.

Provide as much context as possible and use specific and detailed language. You can include information about:

  • Your desired focus, format, style, intended audience and text length.
  • A list of points you want addressed.
  • What perspective you want the text written from, if applicable.
  • Specific requirements, such as no jargon.

Try an iterative approach

Ethan Mollick offers the following:

  • The best way to use AI systems is not to craft the perfect prompt, but rather to use it interactively. Try asking for something. Then ask the AI to modify or adjust its output. Work with the AI, rather than trying to issue a single command that does everything you want. The more you experiment, the better off you are. Just use the AI a lot, and it will make a big difference – a lesson my class learned as they worked with the AI to create essays.
  • More elaborate and specific prompts work better.
  • Don’t ask it to write an essay about how human error causes catastrophes. The AI will come up with a boring and straightforward piece that does the minimum possible to satisfy your simple demand. Instead, remember you are the expert, and the AI is a tool to help you write. You should push it in the direction you want. For example, provide clear bullet points to your argument: write an essay with the following points: -Humans are prone to error -Most errors are not that important -In complex systems, some errors are catastrophic -Catastrophes cannot be avoided.
  • But even these results are much less interesting than a more complicated prompt: write an essay with the following points. use an academic tone. use at least one clear example. make it concise. write for a well-informed audience. use a style like the New Yorker. make it at least 7 paragraphs. vary the language in each one. end with an ominous note. -Humans are prone to error -Most errors are not that important -In complex systems, some errors are catastrophic -Catastrophes cannot be avoided
  • Try asking for it to be concise or wordy or detailed, or ask it to be specific or to give examples. Ask it to write in a tone (ominous, academic, straightforward) or to a particular audience (professional, student) or in the style of a particular author or publication (New York Times, tabloid news, academic journal). You are not going to get perfect results, so experimenting (and using the little “regenerate response” button) will help you get to the right place. Over time, you will start to learn the “language” that ChatGPT is using.

Get ChatGPT to ask you questions

Instead of coming up with your own prompts, try getting the AI to ask you questions to get the information it needs. In a recent Twitter post, Ethan Mollick notes that this approach produced surprisingly good results.

Ideas for using ChatGPT with students

For lots of great ideas and advice, watch Unlocking the Power of AI: How Tools Like ChatGPT Can Make Teaching Easier and More Effective.

  • Use it to create counterarguments to students’ work. Students can use the AI output to further refine their arguments and help them clarify their positions.
  • Use it to write something for different audiences and have students compare the output and identify how writing changes for a general versus expert audience.
  • Use ChatGPT for a first draft and then have students edit a second draft with critiques, corrections, and additions.
  • Use it to start a discussion. For example, ask ChatGPT why one theory is better than another. Then, ask again why the second theory is better.
  • Use it to generate a list of common misconceptions and then have students address them.
  • Ask students to generate a ChatGPT response to a question of their own choosing, and then write an analysis of the strengths and weaknesses of the ChatGPT response.

Some ways you can use ChatGPT

  • Use it to create a bank of multiple choice and short-answer questions for formative assessment. It can also pre-generate sample responses and feedback.
  • Use it to create examples.
  • Use it to generate ten prompts for a class discussion.

Further reading and resources

Heaven, W.D. (2023, April 6). ChatGPT is going to change education, not destroy it. MIT Technology Review.

Liu, D., et al. (2023). How AI can be used meaningfully by teachers and students in 2023. Teaching@Sydney.

Mollick, E. R., & Mollick, L. (2022). New modes of learning enabled by AI Chatbots: Three methods and assignments. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4300783

Rudolph, J., et al. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching, 6(1).